String Functions: Intermediate

Reddit's Trust & Safety data team identifies coordinated inauthentic behavior and spam rings by running SQL pattern matching queries across billions of posts stored in their data warehouse. Functions like REPLACE, LIKE with wildcards, and REGEXP_LIKE let analysts surface policy-violating phrases, off-platform links, and bot-like posting patterns across hundreds of communities in a single query. Reddit engineers have discussed how batch SQL analysis lets a small team audit content at scale and dramatically reduce the case volume that reaches human reviewers. The intermediate string techniques you are about to learn are the same tools that make this kind of large-scale content analysis possible.

SUBSTRING()

Daily Life
Interviews

Extract portions of text by position

SUBSTRING extracts a portion of a string starting at a specified position for a specified length. It allows you to pull out exactly the characters you need from any position within the text. This function is fundamental for parsing structured data embedded in strings.

The syntax is SUBSTRING(string, start_position, length). The start_position indicates where extraction begins (1-indexed, meaning the first character is position 1). The length specifies how many characters to extract. If length is omitted, extraction continues to the end of the string.

Understanding how positions and lengths work in SUBSTRING is key to extracting exactly the characters you need.

//

Basic Extraction

Consider product codes that embed category information in specific positions.
product_codeFormat
ELEC-2024-001CATEGORY-YEAR-SEQUENCE
FURN-2024-042CATEGORY-YEAR-SEQUENCE
TOYS-2023-199CATEGORY-YEAR-SEQUENCE
1SELECT
2 product_code,
3 SUBSTRING(product_code, 1, 4) AS category,
4 SUBSTRING(product_code, 6, 4) AS yr
5FROM products
Result
product_codecategoryyr
ELEC-2024-001ELEC2024
FURN-2024-042FURN2024
TOYS-2023-199TOYS2023

SUBSTRING(product_code, 1, 4) extracts 4 characters starting at position 1, returning the category code. SUBSTRING(product_code, 6, 4) extracts 4 characters starting at position 6, returning the year. Position counting starts at 1, not 0.

//

Extract to String End

When you omit the length parameter, SUBSTRING extracts from the start position to the end of the string:

With length specified as 5, exactly 5 characters are extracted starting at position 7.
1SELECT
2 SUBSTRING('hello world', 7, 5) AS result

Practical Applications

SUBSTRING becomes powerful when combined with other functions like STRPOS to handle variable-length data.

//

Parsing Email Domains

SUBSTRING combined with STRPOS can extract variable-length portions of strings:

1SELECT
2 email,
3 SUBSTRING(
4 email,
5 STRPOS(email, '@') + 1
6 ) AS domain
7FROM users
Result
emaildomain
alice@gmail.comgmail.com
bob@company.orgcompany.org
carol@school.eduschool.edu
This query finds the position of the @ symbol, adds 1 to skip past it, then extracts everything from that point to the end. The domain portion has variable length, so omitting the length parameter is essential.
TIP
Position counting starts at 1 in SQL, not 0 like most programming languages. SUBSTRING('hello', 1, 1) returns 'h', not 'e'. This is a frequent source of off-by-one errors.
//

Fixed-Width Extraction

Many legacy systems use fixed-width formats where each field occupies specific character positions:
recordLayout
SMITH JOHN 19850315NYCLastName(10) FirstName(10) DOB(8) City(3)
1SELECT
2 TRIM(SUBSTRING(record, 1, 10)) AS last_name,
3 TRIM(SUBSTRING(record, 11, 10)) AS first_name,
4 SUBSTRING(record, 21, 8) AS birth_date,
5 SUBSTRING(record, 29, 3) AS city_code
6FROM legacy_records
Result
last_namefirst_namebirth_datecity_code
SMITHJOHN19850315NYC

Each field is extracted by its position and length. TRIM removes the padding spaces from the name fields. This pattern is common when migrating data from mainframe systems.

Best Practices

SUBSTRING excels at structured parsing tasks but requires careful attention to position counting.

//

When to Use SUBSTRING()

SUBSTRING is ideal for data with predictable structure where you know exactly where values are located.

Structured Identifiers
Structured Identifiers
Extract embedded codes from product IDs, order numbers, or SKUs
Fixed-Width Parsing
Fixed-Width Parsing
Parse legacy record formats with fixed character positions
Composite Keys
Composite Keys
Break apart composite keys into individual components
URLs and Paths
URLs and Paths
Extract portions of URLs, file paths, or address strings

Understanding SUBSTRING execution helps you predict behavior with edge cases like NULL inputs or out-of-range positions.

How SUBSTRING() Executes
  • SUBSTRING() scans to the start position, then copies the specified length
  • Returns NULL if the input string is NULL
  • Returns empty string if start position exceeds string length
  • Negative start positions work differently across databases
This function is a workhorse in real-world data pipelines.

> Complete this query to extract the category code from a product code.

SELECT
  ___(product_code, ___, ___) AS category
FROM products
4
SUBSTRING
6
1

SUBSTRING with a fixed start position and length is the right tool when your data follows a consistent structure where each field always occupies the same character positions.

Combining SUBSTRING with STRPOS lets you handle variable-length strings by first finding a delimiter and then extracting relative to that location rather than relying on fixed positions.

SQL positions start at 1, not 0, so if you are used to zero-indexed programming languages, remember to shift your position arguments by 1 to avoid off-by-one extraction errors.

Edge Extractions

Daily Life
Interviews

Pull text from the start or end of values

Extracting characters from the beginning or end of a string is a common operation. SUBSTR handles both cases. To get the first N characters, use SUBSTR(string, 1, N). To get the last N characters, use a negative start position: SUBSTR(string, -N).

Dedicated prefix/suffix functions are found in some databases, but Presto provides unified handling through SUBSTR with position arguments. This single function replaces the need for multiple specialized functions, giving you consistent syntax across all extraction patterns.

Using SUBSTR from position 1 with a specified length captures the prefix portion of any string with clarity and consistency.

//

Extract from Start

SUBSTR(string, 1, N) extracts the first N characters, ideal for fixed-length prefixes like country codes, category identifiers, or version numbers:

phone_number
+1-555-123-4567
+44-20-7946-0958
+81-3-1234-5678
1SELECT
2 phone_number,
3 SUBSTR(phone_number, 1, 3) AS country_code
4FROM contacts
Result
phone_numbercountry_code
+1-555-123-4567+1-
+44-20-7946-0958+44
+81-3-1234-5678+81

SUBSTR(phone_number, 1, 3) extracts the first 3 characters, capturing the country code prefix. The 1 indicates starting at the first character, and 3 is the length to extract.

RIGHT()

Extracting from the end requires calculating the start position or using negative indices, depending on your database.
//

Extracting from the End

SUBSTR with a negative start position extracts from the end of a string, useful for file extensions, trailing codes, or check digits:

filename
report_2024.pdf
data_export.csv
backup.sql.gz
1SELECT
2 filename,
3 SUBSTR(
4 filename,
5 LENGTH(filename) - 2,
6 3
7 ) AS extension
8FROM files
Result
filenameextension
report_2024.pdfpdf
data_export.csvcsv
backup.sql.gz.gz

SUBSTR(filename, -3) extracts the last 3 characters. The negative position -3 means "start 3 characters from the end." For fixed-length extensions like .pdf or .csv, this works perfectly. For variable-length extensions, you would need STRPOS to find the dot.

Comparison & Best Practices

Understanding the differences between extraction methods helps you choose the right approach for each situation.
//

Extraction Patterns

Compare how different SUBSTR calls extract from the beginning, end, or middle of a string.

SUBSTR(str, 1, 4) starts from position 1 and extracts 4 characters from the beginning.
1SELECT
2 SUBSTR('ABCDEFGH', 1, 4) AS result
//

Practical Examples

Extracting order year and sequence number from order IDs:
1SELECT
2 order_id,
3 SUBSTR(order_id, 1, 4) AS order_year,
4 SUBSTR(
5 order_id,
6 LENGTH(order_id) - 5,
7 6
8 ) AS sequence_num
9FROM orders
Result
order_idorder_yearsequence_num
2024-0000012024000001
2024-0000422024000042
2023-0012342023001234

To get the first N characters, use SUBSTR(str, 1, N). To get the last N characters, use SUBSTR(str, -N). These patterns handle all the use cases needed for extracting fixed-width prefixes and suffixes from standardized identifiers.

//

Extracting Prefixes and Suffixes

Prefixes (the leading portion of a string) and suffixes (the trailing portion) are common in structured data. Country codes, file extensions, year prefixes, check digits, and category codes are all prefixes or suffixes that can be extracted using SUBSTR.

CountryExtensionYearCheck DigitCategory
Country
Region Prefix
Extract leading symbols
Extension
File Suffix
Pull trailing file types
Year
Date Prefix
Parse leading year digits
Check Digit
Trailing Digit
Validate ending numbers
Category
Embedded Code
Extract category segment

SUBSTR execution differs slightly from SUBSTRING in some databases, particularly with negative positions.

How SUBSTR() Executes
  • SUBSTR() with positive positions scans from the start
  • SUBSTR() with negative positions counts backward from the end
  • Returns NULL if the input is NULL
  • If requested length exceeds string length, the available portion is returned

> Complete this query to extract the first 3 characters of each product code as the category.

SELECT
  ___(product_code, ___, ___) AS category
FROM products
5
1
SUBSTR
3
Extracting from the end of a string with a negative start position is a compact pattern for pulling file extensions or trailing codes without needing to know the total length of the string.

When the position of a delimiter is unknown in advance, combining STRPOS to locate it and SUBSTR to extract around it produces a robust parser that works correctly regardless of how long the prefix or suffix is.

Edge extraction patterns like pulling the first or last N characters appear constantly in ETL pipelines when parsing structured identifiers, timestamps embedded in filenames, or region prefixes in order codes.

REPLACE()

Daily Life
Interviews

Swap characters or words inside text

REPLACE finds all occurrences of a substring and replaces them with a different substring. Syntax: REPLACE(string, search_string, replacement_string). The comparison is case-sensitive, and every match is replaced globally throughout the string.

REPLACE is essential for data cleaning, standardization, and text transformation workflows.

//

Basic Replacement

Standardizing company names by expanding abbreviations.
company_name
Acme Corp.
Tech Inc.
Data Corp. Systems
1SELECT
2 company_name,
3 REPLACE(
4 company_name,
5 'Corp.',
6 'Corporation'
7 ) AS standardized
8FROM companies
Result
company_namestandardized
Acme Corp.Acme Corporation
Tech Inc.Tech Inc.
Data Corp. SystemsData Corporation Systems

Every occurrence of 'Corp.' is replaced with 'Corporation'. The third row shows that REPLACE works even when the search string appears in the middle. Rows without matches are unchanged.

//

Removing Characters

To remove characters, replace them with an empty string:
1SELECT
2 phone_number,
3 REPLACE(
4 REPLACE(
5 REPLACE(phone_number, '-', ''),
6 '(',
7 ''
8 ),
9 ')',
10 ''
11 ) AS digits_only
12FROM contacts
Result
phone_numberdigits_only
(555) 123-4567555 1234567
555-987-65435559876543
(800) 555-0123800 5550123

Chaining REPLACE calls removes multiple different characters. Each call removes one type of character. The result contains only digits and spaces (spaces would require another REPLACE to remove).

Advanced Usage

REPLACE becomes more powerful when chained together or combined with other string functions.

//

Data Standardization

REPLACE is commonly used to convert between data formats or fix inconsistent input:

1SELECT
2 date_text,
3 REPLACE(date_text, '/', '-') AS iso_format
4FROM imports
Result
date_textiso_format
2024/01/152024-01-15
2024/12/312024-12-31
2023/06/012023-06-01
Date formats with slashes are converted to ISO format with hyphens. This standardization is critical when importing data from systems with different conventions.
//

Single vs Chained

A single REPLACE handles one substitution. For multiple characters, you chain calls. Compare the approaches:

A single REPLACE removes one type of character. This query strips hyphens from phone numbers, turning "555-123-4567" into "5551234567".
1SELECT
2 REPLACE(phone, '-', '') AS cleaned
3FROM contacts
TIP
REPLACE() is case-sensitive in most databases. REPLACE('Hello', 'HELLO', 'Hi') returns 'Hello' unchanged because 'HELLO' does not match 'Hello'. Use LOWER() or UPPER() if you need case-insensitive replacement.
//

Multiple Replacements

For multiple different replacements, nest REPLACE calls:

1SELECT
2 REPLACE(
3 REPLACE(
4 REPLACE(status, 'P', 'Pending'),
5 'A',
6 'Approved'
7 ),
8 'R',
9 'Rejected'
10 ) AS full_status
11FROM orders
Result
statusfull_status
PPending
AApproved
RRejected

Each REPLACE() transforms one code. The innermost replacement runs first, then the next, and so on. Order matters if replacements might overlap.

Best Practices

REPLACE is a workhorse function for data cleaning, but knowing when to use it versus other approaches matters.

//

When to Use REPLACE()

Compare your data before and after applying REPLACE to understand its impact on consistency.

Before REPLACE()
  • Inconsistent abbreviations
  • Unwanted characters in data
  • Mixed date/number formats
  • Legacy single-char codes
After REPLACE()
  • Standardized terminology
  • Clean, uniform strings
  • Consistent date formats
  • Human-readable values

REPLACE performs a global search-and-replace in a single pass, making it efficient for bulk text transformations.

How REPLACE() Executes
  • REPLACE() scans the string once, replacing all matches
  • Returns the original string if no matches found
  • Returns NULL if any argument is NULL
  • Case-sensitive; can increase or decrease string length

Data standardization is one of the most common real-world applications of REPLACE.

> Complete this query to replace underscores with spaces in product names.

SELECT
  ___(product_name, ___, ___) AS clean_name
FROM products
' '
REPLACE
'_'
'-'
REPLACE() Behavior
  • Performs global substitution: every occurrence is replaced in a single pass
  • Chain multiple REPLACE calls to strip multiple unwanted characters (standard ETL pattern)
  • Case-sensitive: normalize with LOWER() or UPPER() first if source data has inconsistent casing

REPLACE substitutes every occurrence in one pass, not just the first. This global behavior is what makes it effective for bulk data cleaning across entire columns.

Nesting REPLACE() calls is the standard approach for removing or substituting multiple different characters in a single query.

STRPOS()

Daily Life
Interviews

Locate where a pattern appears in text

STRPOS finds the starting position of a substring within a larger string. Syntax: STRPOS(string, substring). Returns the 1-indexed position of the first match, or 0 if not found.

STRPOS is crucial for locating delimiters, parsing variable-format strings, and building conditional logic based on string contents. The position it returns can then be used directly in other string functions.

//

Delimiter Positions

STRPOS is essential for parsing strings with delimiters.

email
alice@gmail.com
bob@company.org
support@help.desk.com
1SELECT
2 email,
3 STRPOS(email, '@') AS at_position
4FROM users
Result
emailat_position
alice@gmail.com6
bob@company.org4
support@help.desk.com8

STRPOS() returns where the @ symbol appears. The position varies by email length. In 'alice@gmail.com', @ is at position 6. In 'bob@company.org', @ is at position 4.

//

Extraction by Position

Combine STRPOS with SUBSTRING to extract variable-length portions:

1SELECT
2 email,
3 SUBSTRING(
4 email,
5 1,
6 STRPOS(email, '@') - 1
7 ) AS username,
8 SUBSTRING(
9 email,
10 STRPOS(email, '@') + 1
11 ) AS domain
12FROM users
Result
emailusernamedomain
alice@gmail.comalicegmail.com
bob@company.orgbobcompany.org
support@help.desk.comsupporthelp.desk.com

The username is extracted from position 1 to just before the @. The domain is extracted from just after the @ to the end. STRPOS provides the exact location needed for these calculations.

Advanced Usage

Beyond simple lookups, STRPOS enables pattern-based logic and complex parsing strategies.

//

Substring Existence

STRPOS returns 0 when the substring is not found, making it useful for conditional logic:

1SELECT
2 product_name,
3 CASE
4 WHEN STRPOS(product_name, 'Pro') > 0 THEN 'Professional'
5 WHEN STRPOS(product_name, 'Basic') > 0 THEN 'Standard'
6 ELSE 'Unknown'
7 END AS tier
8FROM products
Result
product_nametier
Widget Pro XProfessional
Basic WidgetStandard
Super WidgetUnknown

If STRPOS returns a value greater than 0, the substring exists. This pattern enables classification based on string contents without exact matching.

STRPOS returns 0 when not found, but returns 1 for a match at the beginning. Remember that SQL positions start at 1, so a match at the first character returns 1, not 0.
//

Multiple Delimiters

When parsing complex strings with multiple delimiters, STRPOS helps locate each one:

1SELECT
2 file_path,
3 STRPOS(file_path, '/') AS first_slash
4FROM files
Result
file_pathfirst_slash
/home/user/data.csv1
data/exports/file.json5
backup.sql0

STRPOS finds the first occurrence only. For paths with multiple slashes, you would extract portions sequentially or use database-specific functions for splitting.

Best Practices

STRPOS shines in data parsing tasks where you need to locate dynamic boundaries within strings.

//

When to Use STRPOS()

These are the most common scenarios where STRPOS proves invaluable.

Locate Delimiters
Locate Delimiters
Find the position of separators for substring extraction
Pattern Checking
Pattern Checking
Check if a string contains a specific pattern
Variable-Format Parsing
Variable-Format Parsing
Parse emails, URLs, or other dynamic-length data
Conditional Logic
Conditional Logic
Make decisions based on string contents

STRPOS scans from the start of the string, so its performance depends on where matches occur.

How STRPOS() Executes
  • STRPOS() scans the string sequentially until a match is found
  • Returns 0 if not found; returns 1 if match at first character
  • Case-sensitive in most databases; finds first occurrence only

> Complete this query to find the position of the @ symbol in email addresses.

SELECT
  ___(email, ___) AS at_pos
FROM users
INSTR
FROM
'@'
STRPOS
STRPOS() Advanced Patterns
  • Returns 0 (not NULL) when substring not found; compare result to 0 rather than using IS NULL
  • Most powerful when used as input to SUBSTRING() for dynamic extraction based on found position
  • Only finds first occurrence; parsing multiple delimiters requires nesting calls or using REPLACE() first

STRPOS returns 0 when the substring is not found. This means STRPOS(col, 'x') > 0 reliably tests for presence, while comparing the result to NULL would never match.

The most powerful use of STRPOS() is feeding its result to SUBSTRING(): by finding a delimiter position dynamically, you can extract variable-length substrings that depend on the actual content of each row.

TRIM()

Daily Life
Interviews

Remove unwanted whitespace from values

TRIM removes unwanted characters from the beginning, end, or both sides of a string. By default it removes whitespace. Syntax: TRIM(string) or TRIM(LEADING/TRAILING/BOTH characters FROM string) for custom characters. LTRIM and RTRIM are shortcuts.

TRIM is essential for cleaning user input and ensuring accurate string comparisons, especially when dealing with legacy data that has inconsistent whitespace.

//

Removing Whitespace

Whitespace problems are invisible but cause real bugs. Extra spaces prevent joins from matching and corrupt comparisons.
usernameDisplay
alice " alice " (spaces on both sides)
bob "bob " (trailing spaces)
carol" carol" (leading spaces)
1SELECT
2 username,
3 TRIM(username) AS cleaned
4FROM users
Result
usernamecleaned
alice alice
bob bob
carolcarol

TRIM removes spaces from both ends. All usernames are now clean and comparable. Without trimming, ' alice ' would not match 'alice' in a WHERE clause or JOIN.

//

LTRIM() and RTRIM()

LTRIM removes leading characters only. RTRIM removes trailing characters only. These are useful when you need asymmetric cleaning:

TRIM removes whitespace from both sides, leaving just 'hello'.
1SELECT
2 TRIM(' hello ') AS result
//

Custom Character Trim

You can specify which characters to remove instead of whitespace:
1SELECT
2 price_text,
3 TRIM(LEADING '$' FROM price_text) AS amount
4FROM products
Result
price_textamount
$99.9999.99
$$50.00$50.00
$1,299.001,299.00

TRIM(LEADING '$' FROM ...) removes dollar signs from the beginning. Note that the second row has two dollar signs, and only the leading ones are removed. TRIM removes all consecutive matching characters from the specified side.

Practical Applications

TRIM is a staple in data pipelines and ETL workflows. Here are common patterns you will encounter.

//

Cleaning Import Data

Data imported from spreadsheets or external systems often has invisible whitespace that breaks joins:
1SELECT
2 a.product_id,
3 b.product_name
4FROM orders AS a
5INNER JOIN products AS b
6 ON TRIM(a.product_id) = TRIM(b.product_id)
Result
product_idproduct_name
P001Widget
P002Gadget

When joins mysteriously return no results, whitespace is often the culprit. Applying TRIM() to both sides of the join ensures spaces do not prevent matches.

//

Import vs Query TRIM

You can apply TRIM when data is loaded or when it is queried. Each approach has trade-offs for performance and data integrity.

Trimming at query time is flexible but slower. The database must trim every value on every query, and indexes cannot be used on the trimmed result.
1SELECT
2 a.id,
3 b.name
4FROM orders AS a
5INNER JOIN products AS b
6 ON TRIM(a.product_id) = TRIM(b.product_id)
TIP
Applying TRIM in JOIN conditions prevents index usage. For production systems, clean data at import time and store trimmed values. Use TRIM in joins only for debugging or one-time data fixes.

Best Practices

TRIM is one of the most frequently used functions in data cleaning pipelines. Here is when to apply it.

//

When to Use TRIM()

These are the most common scenarios where whitespace causes problems.
InputJoinsPrefixDedupImport
Input
User Cleanup
Strip unwanted spaces
Joins
Fix Matching
Remove hidden whitespace
Prefix
Strip Symbols
Remove leading characters
Dedup
Find Duplicates
Normalize before compare
Import
ETL Cleaning
Sanitize external sources

TRIM is highly efficient because it only examines characters at the string edges, not the entire content.

How TRIM() Executes
  • TRIM() scans from the edges inward until non-matching characters
  • Returns NULL if input is NULL
  • Very efficient as it only examines edge characters; does not affect middle

Invisible whitespace bugs are notoriously hard to debug, making TRIM an essential defensive practice.

> Complete this query to remove whitespace from usernames.

SELECT
  ___(username) AS clean_name
FROM users
RTRIM
TRIM
LTRIM
These guidelines help you write efficient and maintainable string manipulation code.
Do
  • Use SUBSTR(str, 1, N) for first N chars, SUBSTR(str, -N) for last N
  • Combine STRPOS() with SUBSTR() for dynamic parsing
  • Apply TRIM() to user input before storage
  • Chain REPLACE() calls for multiple substitutions
  • Test extraction logic with edge cases (empty strings, NULLs)
Don't
  • Assume fixed positions without validating data format
  • Forget that SQL positions start at 1, not 0
  • Use TRIM() in JOIN conditions without understanding the performance impact
  • Ignore case sensitivity in REPLACE() and STRPOS()
  • Apply SUBSTR() with hardcoded positions to variable-format data
String extraction and transformation are among the most common operations in data pipelines. Put these techniques to the test with hands-on challenges.
PUTTING IT ALL TOGETHER

> You are a data engineer at FedEx parsing and normalizing raw address and tracking strings scraped from carrier API responses before loading them into the central warehouse for shipment analytics.

SUBSTRING() with fixed positions extracts the carrier code and service tier embedded at known character offsets within each standardized tracking number format.
STRPOS() locates the delimiter between street address and city in variable-length address strings, giving SUBSTRING() a dynamic start point for the extraction.
REPLACE() strips formatting characters such as parentheses and hyphens from raw phone strings and converts slash-delimited date fields to ISO hyphen format.
TRIM() removes invisible leading and trailing whitespace from carrier response fields that would otherwise break joins against the reference address table.
KEY TAKEAWAYS
SUBSTR() extracts text; positions are 1-indexed, not 0-indexed
Use SUBSTR(str, 1, N) for first N chars, SUBSTR(str, -N) for last N
STRPOS() returns 0 if not found, not NULL (unlike some languages)
REPLACE() is case-sensitive; replaces all occurrences, not just the first
TRIM() removes only leading/trailing whitespace, not internal spaces
Use LTRIM() and RTRIM() when you need directional control
Hidden whitespace causes mysterious join failures; TRIM() both sides before joining
Combine these functions for complex parsing: SUBSTR + STRPOS is powerful

Slice, search, and reshape text at will

SUBSTR, TRIM, and string surgery 101

Category
SQL
Difficulty
intermediate
Duration
30 minutes
Challenges
0 hands-on challenges

Topics covered: SUBSTRING(), Edge Extractions, REPLACE(), STRPOS(), TRIM()

Lesson Sections

  1. SUBSTRING() (concepts: sqlSubstring)

    Basic Extraction Consider product codes that embed category information in specific positions. Extract to String End Practical Applications Parsing Email Domains This query finds the position of the @ symbol, adds 1 to skip past it, then extracts everything from that point to the end. The domain portion has variable length, so omitting the length parameter is essential. Fixed-Width Extraction Many legacy systems use fixed-width formats where each field occupies specific character positions: Best

  2. Edge Extractions (concepts: sqlStringBuilding)

    Extract from Start RIGHT() Extracting from the end requires calculating the start position or using negative indices, depending on your database. Extracting from the End Comparison & Best Practices Understanding the differences between extraction methods helps you choose the right approach for each situation. Extraction Patterns Practical Examples Extracting order year and sequence number from order IDs: Extracting Prefixes and Suffixes Extracting from the end of a string with a negative start p

  3. REPLACE() (concepts: sqlReplace)

    Basic Replacement Standardizing company names by expanding abbreviations. Removing Characters To remove characters, replace them with an empty string: Advanced Usage Data Standardization Date formats with slashes are converted to ISO format with hyphens. This standardization is critical when importing data from systems with different conventions. Single vs Chained Multiple Replacements Best Practices When to Use REPLACE()

  4. STRPOS() (concepts: sqlStrpos)

    Delimiter Positions Extraction by Position Advanced Usage Substring Existence STRPOS returns 0 when not found, but returns 1 for a match at the beginning. Remember that SQL positions start at 1, so a match at the first character returns 1, not 0. Multiple Delimiters Best Practices When to Use STRPOS()

  5. TRIM() (concepts: sqlTrim)

    Removing Whitespace Whitespace problems are invisible but cause real bugs. Extra spaces prevent joins from matching and corrupt comparisons. LTRIM() and RTRIM() Custom Character Trim You can specify which characters to remove instead of whitespace: Practical Applications Cleaning Import Data Data imported from spreadsheets or external systems often has invisible whitespace that breaks joins: Import vs Query TRIM Best Practices When to Use TRIM() These are the most common scenarios where whitespa