Split, join, and master text processing
Topics covered: Splitting Strings, Joining Strings, Split and Join Patterns, Advanced Formatting, String Encoding
Custom Delimiter Splitting

Pass a delimiter string to split on specific characters:

The first split produces ['apple', 'banana', 'cherry', 'date']. The path split produces ['', 'home', 'user', 'documents', 'file.txt'] - note the empty string from the leading slash.

Limiting Splits

This splits only at the first 2 colons, producing ['ERROR', '2024-01-15', 'Database connection failed: timeout']. The message with colons stays intact.

Splitting with splitlines()

This produces ['Line 1', 'Line 2', 'Li
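The splitting behaviours described above can be sketched as follows. The input strings are assumptions, chosen so that the results match the outputs quoted in the text; the `report` string is a hypothetical example and not the lesson's own data.

```python
# Hypothetical sample inputs, chosen to reproduce the results described above.
fruits = "apple,banana,cherry,date"
path = "/home/user/documents/file.txt"
log_line = "ERROR:2024-01-15:Database connection failed: timeout"

# Split on an explicit delimiter.
fruit_list = fruits.split(",")

# A leading delimiter yields an empty string as the first element.
path_parts = path.split("/")

# maxsplit=2 stops after the first two colons; the rest stays in one piece.
log_fields = log_line.split(":", 2)

# splitlines() understands both \n and \r\n and ignores a trailing line break,
# whereas split("\n") leaves a stray "\r" and a trailing empty string.
report = "first\nsecond\r\nthird\n"
lines = report.splitlines()
naive_lines = report.split("\n")

print(fruit_list)
print(path_parts)
print(log_fields)
print(lines)
print(naive_lines)
```

The contrast between `lines` and `naive_lines` is the main reason to prefer `splitlines()` when the line-ending style of the input is not known in advance.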
This produces "Python is awesome". The space " " is inserted between each word.

The join() Syntax

This is one of the most common Python mistakes. Test your ability to spot and fix it in the challenge below.

Common Join Patterns

Different separators for different use cases:

These create: "apple,banana,cherry", "home/user/docs", "apple<br>banana<br>cherry", and a multi-line string with each item on its own line.

Empty String Join

Joining with an empty string concatenates elements directly:

This pr
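A minimal sketch of the join patterns above. The lists are assumptions picked to reproduce the quoted results. The text does not say which mistake it has in mind; the sketch assumes it is calling `join()` on the list rather than on the separator string, which is a frequent slip but only a guess here.

```python
# Hypothetical sample data matching the results quoted above.
words = ["Python", "is", "awesome"]
fruits = ["apple", "banana", "cherry"]

# join() is a method of the separator string, not of the list.
sentence = " ".join(words)

# Assumed "common mistake": lists have no join() method.
try:
    words.join(" ")
    mistake_error = None
except AttributeError as exc:
    mistake_error = type(exc).__name__

# Different separators for different use cases.
csv_line = ",".join(fruits)
path = "/".join(["home", "user", "docs"])
html = "<br>".join(fruits)
multiline = "\n".join(fruits)

# An empty separator concatenates the elements directly.
letters = "".join(["a", "b", "c"])

# join() accepts only strings, so convert other types first.
numbers = ", ".join(str(n) for n in [1, 2, 3])

print(sentence, mistake_error, csv_line, path, html, letters, numbers, sep="\n")
```

Note the last pattern: passing integers straight to `join()` raises a `TypeError`, so the conversion with `str()` is required.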
Combining split() and join() enables powerful text transformations. This pattern is used constantly in real-world code.

Changing Delimiters

Convert between different delimited formats:

Normalizing Whitespace

Replace multiple spaces with single spaces:

Transforming Each Element

Process each part before joining back:

These produce "Maya Johnson" and "M.J." respectively. The pattern splits, transforms each piece, and rejoins.

Now try filling in the blank to convert a snake_case variable name to Tit
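The three split-and-join patterns above might look like this. The inputs (`csv_row`, `messy`, `full_name`) are hypothetical; `full_name` is chosen so that the results match the "Maya Johnson" and "M.J." outputs quoted in the text.

```python
# Hypothetical inputs for illustration.
csv_row = "name,age,city"
messy = "  too   many    spaces  "
full_name = "maya johnson"

# Changing delimiters: split on the old one, join with the new one.
tsv_row = "\t".join(csv_row.split(","))

# Normalizing whitespace: split() with no argument splits on any run of
# whitespace and drops leading/trailing whitespace.
clean = " ".join(messy.split())

# Transforming each element before joining back.
title_name = " ".join(part.capitalize() for part in full_name.split())
initials = ".".join(part[0].upper() for part in full_name.split()) + "."

print(repr(tsv_row))
print(clean)
print(title_name)
print(initials)
```

Calling `split()` without an argument is what makes the whitespace pattern work; `split(" ")` would instead produce empty strings for each extra space.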
F-strings support advanced formatting for alignment, padding, and number presentation. These features create professional-looking output.

Alignment and Padding

Control how values are positioned within a fixed width:

Number Formatting

Format numbers with precision, separators, and signs:

These produce: "Pi: 3.1416" (4 decimals), "Big: 1,234,567,890" (comma separators), "Signed: -42" (explicit sign), and "Padded: 00042" (zero-padded).

Percentage and Scientific

For financial applications, consider
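The format specifiers discussed above can be sketched as follows. The variable names and values are assumptions, chosen so that the number-formatting results match the outputs quoted in the text; the alignment, percentage, and scientific examples use hypothetical values.

```python
# Hypothetical values chosen to reproduce the outputs quoted above.
name = "Ada"
pi = 3.14159265
big = 1234567890
value = -42
num = 42

# Alignment within a fixed width: < left, > right, ^ center.
left = f"[{name:<8}]"
right = f"[{name:>8}]"
center = f"[{name:^9}]"

# Precision, thousands separators, explicit sign, zero padding.
pi_text = f"Pi: {pi:.4f}"
big_text = f"Big: {big:,}"
signed_text = f"Signed: {value:+}"
positive_text = f"Signed: {num:+}"
padded_text = f"Padded: {num:05d}"

# Percentage (multiplies by 100) and scientific notation.
percent_text = f"{0.25:.1%}"
sci_text = f"{1234567.0:.2e}"

for line in (left, right, center, pi_text, big_text, signed_text,
             positive_text, padded_text, percent_text, sci_text):
    print(line)
```

The `+` flag only changes the output for non-negative numbers: -42 prints as "-42" either way, while 42 becomes "+42".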
Computers store text as bytes, not characters. Encoding is the process of converting characters to bytes. Understanding encoding prevents mysterious bugs when working with files, APIs, and databases.

Strings vs Bytes

encode() and decode()

UTF-8 is the most common encoding. It handles all Unicode characters and is the default for web and most modern systems.

Unicode Characters

Unicode supports characters from all languages and emojis:

Using the wrong encoding produces a distinctive class of bugs.
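A short sketch of the string/bytes round trip and of what a wrong encoding looks like in practice. The sample strings are hypothetical; the calls are the standard `str.encode()` and `bytes.decode()` methods.

```python
# Hypothetical sample text containing one non-ASCII character.
text = "café"

# encode(): str -> bytes. "é" takes two bytes in UTF-8.
data = text.encode("utf-8")
char_count = len(text)
byte_count = len(data)

# decode(): bytes -> str, using the same encoding.
round_trip = data.decode("utf-8")

# Decoding with the wrong encoding does not fail here; it silently
# produces garbled text (often called mojibake).
garbled = data.decode("latin-1")

# Encoding to a charset that lacks the character raises an error,
# unless an error handler such as "replace" is given.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = type(exc).__name__
replaced = text.encode("ascii", errors="replace")

# Emoji are single characters but take four bytes in UTF-8.
snake_bytes = len("🐍".encode("utf-8"))

print(data, char_count, byte_count, round_trip, garbled,
      ascii_error, replaced, snake_bytes, sep="\n")
```

The garbled result "cafÃ©" is typical of UTF-8 bytes read as Latin-1, which is one recognizable form of the bugs mentioned above.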