Data Types: Beginner
In Twitter's early days, the platform stored user and tweet IDs as 32-bit integers, a choice that seemed perfectly reasonable for a small startup. By 2014 those IDs had overflowed 2.1 billion, the maximum an INTEGER can hold, and Twitter engineers had to execute an emergency migration to 64-bit BIGINT across billions of rows while the site stayed live. The incident, which contributed to the infamous fail whale outages, is now a classic case study in why data type choices made on day one follow a product for years. In this lesson you will learn the core SQL types, understand exactly what each one stores, and build the instinct to choose correctly before scale turns a small decision into a crisis.
Why data types matter
Prevent errors from mismatched types
Type Mismatch Consequences
What happens when you store a number as VARCHAR instead of INTEGER?
Arithmetic Proves the Type
One quick way to verify a column is stored as a number is to use it in arithmetic. If a column is really an integer, adding or subtracting will work correctly.
> Complete this query to calculate each product's price per unit by dividing total cost by quantity.
SELECT product, total_cost AS unit_price FROM orders
INTEGER for whole numbers
Store and compute with numeric data
An INTEGER stores whole numbers -- numbers without decimal points. Think of it like a counter that only shows complete units: 1, 2, 3, not 1.5.
Integer Applications
- Unique identifiers:
user_id,order_id - Reference columns:
customer_idin an orders table - Counts and quantities:
stock_count,view_count - Years and ages:
birth_year,age - Status codes: 1=pending, 2=complete
INTEGER: Always 4 bytes. Range: -2,147,483,648 to 2,147,483,647 (~2.1 billion). If you need larger values, use BIGINT (8 bytes, up to 9 quintillion).> Complete the query to calculate total value by combining two numeric columns.
SELECT test_name, value AS total FROM ab_results
INTEGER is the workhorse of SQL schemas. It supports the full range of arithmetic operations and sorts numerically, making it ideal for counts, ranks, and quantities.
BIGINT provides a larger range at the cost of double the storage. Use it when values can exceed 2.1 billion, such as globally unique identifiers or high-volume event counters.
STRING types (VARCHAR)
Handle text of varying lengths
VARCHAR stores text -- a sequence of characters (often called a "string"). Unlike integers which have a fixed size, text values vary in length: "Hi" is 2 characters while "Hello World" is 11.
Choosing String Types
- Variable length up to n chars
- Stores only actual content
- Most common string type
- Use for: names, emails
- Fixed length, always n chars
- Pads short values with spaces
- Faster for fixed-width data
- Use for: country codes (US)
VARCHAR(100) can store 100 ASCII characters but only 25-50 emojis. Plan for your actual character set.> Complete this query to select each user's email with a friendlier column name.
SELECT username, contact_email FROM users
VARCHAR is the most common string type because it stores only the actual content, not a fixed allocation. This efficiency matters at scale where millions of rows with variable-length strings add up quickly.
When sizing VARCHAR columns, audit your actual data first. Choosing VARCHAR(50) for a field that only ever holds 10 characters wastes both index space and makes comparisons slower.
BOOLEAN for true/false
Represent yes/no decisions in data
A BOOLEAN represents logical truth values: TRUE or FALSE. Think of it like a light switch -- it can only be on or off, nothing in between.
TRUE and FALSE are reserved SQL keywords, not strings. You will learn to combine them with logical operators in later lessons.
Boolean Best Practices
Using BOOLEAN properly improves both storage efficiency and code clarity.
- Use BOOLEAN for true/false columns
- Name boolean columns clearly: is_active, has_paid
- Use BOOLEAN for yes/no flags instead of INTEGER 0/1
- Store "true"/"false" as VARCHAR strings
- Use INTEGER 0/1 when BOOLEAN is available
- Forget that BOOLEAN columns can also be NULL
> Complete this query to select each user's ID and whether their account is active.
SELECT user_id, AS FROM users
Boolean columns consume only 1 byte each. When a table has many flag columns, using BOOLEAN instead of VARCHAR or INTEGER significantly reduces both storage and join key sizes.
SQL BOOLEAN can also be NULL, representing an unknown state. This three-valued logic (true, false, unknown) is important to handle correctly in WHERE clauses and aggregations.
CAST for type conversion
Convert values between types safely
CAST explicitly converts values from one type to another. This is essential when data is stored in the wrong type or when you need to perform type-specific operations.
CAST fails if conversion is impossible (e.g., CAST('abc' AS INTEGER). Use TRY_CAST for safe conversions that won't crash your query (covered in intermediate lesson).Practice Conversions
Apply CAST to convert between common types in these practice exercises.
> Complete this query to convert the text-based price into a whole number.
SELECT (price AS ) FROM products
CAST is explicit and safe when you know the data is valid. For untrusted data from external sources, TRY_CAST is preferable because it returns NULL on failure instead of crashing the query.
Type conversions happen at query time and have a cost. When the same CAST is needed repeatedly, consider storing the converted value in a dedicated column during the ETL process.
Mastering CAST and data types together means you can work confidently with data from any source, whether it arrives with the correct types or needs cleanup before analysis.
> You are a data engineer at Stripe designing the first production schema for a payments platform. You must choose column types that store user records and transaction amounts accurately without expensive migrations as data volume scales.
INTEGER vs DECIMAL determines whether transaction amounts store exact cents or silently lose precision through floating-point rounding errors.VARCHAR length limits on user_name and email columns prevent runaway storage costs while still accommodating the longest realistic inputs.BOOLEAN columns for is_active and is_verified store binary flags more efficiently than VARCHAR yes/no fields, saving bytes per row at scale.CAST converts imported string values from CSV onboarding files into proper numeric types so arithmetic on transaction totals works correctly.INTEGER is fastest for counting, IDs, and quantities (4 bytes, hardware-accelerated)VARCHAR(n) is the standard string type; use CHAR(n) only for fixed-length codes like country codesBOOLEAN stores TRUE/FALSE/NULL in 1 byte; use it for yes/no flagsCAST converts between types; use it when operations require specific typesData Types: Beginner
INT, VARCHAR, and the lies we tell
- Category
- SQL
- Difficulty
- beginner
- Duration
- 19 minutes
- Challenges
- 0 hands-on challenges
Topics covered: Why data types matter, INTEGER for whole numbers, STRING types (VARCHAR), BOOLEAN for true/false, CAST for type conversion
Lesson Sections
- Why data types matter (concepts: sqlBasicTypes)
Data types are not just labels. They fundamentally determine how your data is stored, processed, and queried. The database uses your type choice to decide everything from how much disk space to use, to whether math operations are allowed. SQL databases organize data into three fundamental categories. Every other type is a specialization of these: Type Mismatch Consequences Choosing the wrong type leads to unexpected behavior in operations like arithmetic and sorting. Arithmetic Proves the Type T
- INTEGER for whole numbers
Integer literals in SQL are written as bare numbers without quotes. SQL uses quotes to distinguish strings from numbers: 42 (no quotes) is a number, while '42' (with quotes) is text. Integer Applications Integers are the most common type for IDs, counts, and any whole number data. Understanding storage size helps you choose between integer types for performance-sensitive tables. Integer types are hardware-accelerated in modern CPUs. Operations on integers run faster than equivalent operations on
- STRING types (VARCHAR)
String literals in SQL are enclosed in single quotes. Double quotes are reserved for identifiers (table and column names). This is a common source of confusion for beginners. Choosing String Types Different string types suit different use cases based on length variability and storage needs. One of the most common string operations is measuring how long a value is. Try it yourself. Single quotes delimit string literals in SQL while double quotes delimit identifiers like table and column names. Mi
- BOOLEAN for true/false
Boolean Best Practices Choosing the right type also has real storage consequences at scale. Name boolean columns with a clear prefix like is_, has_, or can_ so readers immediately know the column holds a true/false value without reading documentation.
- CAST for type conversion (concepts: sqlCast)
Practice Conversions