Data Types: Beginner

In Twitter's early days, the platform stored user and tweet IDs as 32-bit integers, a choice that seemed perfectly reasonable for a small startup. By 2014 those IDs had overflowed 2.1 billion, the maximum an INTEGER can hold, and Twitter engineers had to execute an emergency migration to 64-bit BIGINT across billions of rows while the site stayed live. The incident, which contributed to the infamous fail whale outages, is now a classic case study in why data type choices made on day one follow a product for years. In this lesson you will learn the core SQL types, understand exactly what each one stores, and build the instinct to choose correctly before scale turns a small decision into a crisis.

Why data types matter

Daily Life
Interviews

Prevent errors from mismatched types

Data types are not just labels. They fundamentally determine how your data is stored, processed, and queried. The database uses your type choice to decide everything from how much disk space to use, to whether math operations are allowed.
Storage efficiency
Types determine how much disk space each value uses.
Data integrity
Only valid operations are allowed (math on numbers only).
Sort order
Numbers sort numerically; text sorts alphabetically.
Query performance
Indexes work best when types match the data.
SQL databases organize data into three fundamental categories. Every other type is a specialization of these:
INTEGERVARCHARBOOLEAN
INTEGER
Whole Numbers
Counts, IDs, quantities
VARCHAR
Text Strings
Names, emails, addresses
BOOLEAN
True or False
Flags and binary states

Type Mismatch Consequences

Choosing the wrong type leads to unexpected behavior in operations like arithmetic and sorting.

What happens when you store a number as VARCHAR instead of INTEGER?

Strings cannot do arithmetic. Most databases will raise an error or concatenate the values into "10050" instead of computing 150.
1SELECT
2 '100' + '50' AS total
3FROM orders

Arithmetic Proves the Type

One quick way to verify a column is stored as a number is to use it in arithmetic. If a column is really an integer, adding or subtracting will work correctly.

> Complete this query to calculate each product's price per unit by dividing total cost by quantity.

SELECT
  product,
  total_cost ___ ___ AS unit_price
FROM orders
+
quantity
/
product
The type system enforces what operations are legal. Storing a number as text disables arithmetic and changes sort order, so the database treats "9" as greater than "10" alphabetically.
Type mismatches are among the most common sources of data pipeline bugs. Catching them early at schema design time is far cheaper than debugging incorrect aggregations in production.
Understanding data types also helps you read query plans. A type mismatch on a join key forces an implicit conversion that prevents index usage and slows the query significantly.

INTEGER for whole numbers

Daily Life
Interviews

Store and compute with numeric data

An INTEGER stores whole numbers -- numbers without decimal points. Think of it like a counter that only shows complete units: 1, 2, 3, not 1.5.

Integer literals in SQL are written as bare numbers without quotes. SQL uses quotes to distinguish strings from numbers: 42 (no quotes) is a number, while '42' (with quotes) is text.
1SELECT
2 42 AS int_val,
3 - 17 AS neg_int,
4 42 + 8 AS math,
5 '42' AS str
6FROM users
Result
int_valneg_intmathstr
42-175042

Integer Applications

Integers are the most common type for IDs, counts, and any whole number data.
Common Integer Use Cases
  • Unique identifiers: user_id, order_id
  • Reference columns: customer_id in an orders table
  • Counts and quantities: stock_count, view_count
  • Years and ages: birth_year, age
  • Status codes: 1=pending, 2=complete
Understanding storage size helps you choose between integer types for performance-sensitive tables.
TIP
INTEGER: Always 4 bytes. Range: -2,147,483,648 to 2,147,483,647 (~2.1 billion). If you need larger values, use BIGINT (8 bytes, up to 9 quintillion).

> Complete the query to calculate total value by combining two numeric columns.

SELECT
  test_name,
  value ___ ___ AS total
FROM ab_results
'value'
+
result_id
||

INTEGER is the workhorse of SQL schemas. It supports the full range of arithmetic operations and sorts numerically, making it ideal for counts, ranks, and quantities.

BIGINT provides a larger range at the cost of double the storage. Use it when values can exceed 2.1 billion, such as globally unique identifiers or high-volume event counters.

Integer types are hardware-accelerated in modern CPUs. Operations on integers run faster than equivalent operations on strings or decimal types, which matters at analytical scale.

STRING types (VARCHAR)

Daily Life
Interviews

Handle text of varying lengths

VARCHAR stores text -- a sequence of characters (often called a "string"). Unlike integers which have a fixed size, text values vary in length: "Hi" is 2 characters while "Hello World" is 11.

String literals in SQL are enclosed in single quotes. Double quotes are reserved for identifiers (table and column names). This is a common source of confusion for beginners.
1SELECT
2 'Alice' AS name,
3 'It''s escaped' AS escaped
4FROM users
Result
nameescaped
AliceIt's escaped

Choosing String Types

Different string types suit different use cases based on length variability and storage needs.
VARCHAR(n)
  • Variable length up to n chars
  • Stores only actual content
  • Most common string type
  • Use for: names, emails
CHAR(n)
  • Fixed length, always n chars
  • Pads short values with spaces
  • Faster for fixed-width data
  • Use for: country codes (US)
TIP
UTF-8 encoding means international characters and emojis use 2-4 bytes each. VARCHAR(100) can store 100 ASCII characters but only 25-50 emojis. Plan for your actual character set.
One of the most common string operations is measuring how long a value is. Try it yourself.

> Complete this query to select each user's email with a friendlier column name.

SELECT
  username,
  ___ ___ contact_email
FROM users
email
address
AS
FROM

VARCHAR is the most common string type because it stores only the actual content, not a fixed allocation. This efficiency matters at scale where millions of rows with variable-length strings add up quickly.

Single quotes delimit string literals in SQL while double quotes delimit identifiers like table and column names. Mixing them up is one of the most frequent syntax errors for SQL beginners.

When sizing VARCHAR columns, audit your actual data first. Choosing VARCHAR(50) for a field that only ever holds 10 characters wastes both index space and makes comparisons slower.

BOOLEAN for true/false

Daily Life
Interviews

Represent yes/no decisions in data

A BOOLEAN represents logical truth values: TRUE or FALSE. Think of it like a light switch -- it can only be on or off, nothing in between.

TRUE and FALSE are reserved SQL keywords, not strings. You will learn to combine them with logical operators in later lessons.

1SELECT
2 TRUE AS true_val,
3 FALSE AS false_val
4FROM users
Result
true_valfalse_val
truefalse
truefalse
truefalse

Boolean Best Practices

Using BOOLEAN properly improves both storage efficiency and code clarity.

Do
  • Use BOOLEAN for true/false columns
  • Name boolean columns clearly: is_active, has_paid
  • Use BOOLEAN for yes/no flags instead of INTEGER 0/1
Don't
  • Store "true"/"false" as VARCHAR strings
  • Use INTEGER 0/1 when BOOLEAN is available
  • Forget that BOOLEAN columns can also be NULL
Choosing the right type also has real storage consequences at scale.

> Complete this query to select each user's ID and whether their account is active.

SELECT
  user_id,
  ___ AS ___
FROM users
is_paid
is_active
account_status
user_name

Boolean columns consume only 1 byte each. When a table has many flag columns, using BOOLEAN instead of VARCHAR or INTEGER significantly reduces both storage and join key sizes.

Name boolean columns with a clear prefix like is_, has_, or can_ so readers immediately know the column holds a true/false value without reading documentation.

SQL BOOLEAN can also be NULL, representing an unknown state. This three-valued logic (true, false, unknown) is important to handle correctly in WHERE clauses and aggregations.

CAST for type conversion

Daily Life
Interviews

Convert values between types safely

CAST explicitly converts values from one type to another. This is essential when data is stored in the wrong type or when you need to perform type-specific operations.

1SELECT
2 '123' AS original,
3 CAST('123' AS INTEGER) AS as_int,
4 CAST('123' AS INTEGER) + 77 AS total,
5 CAST(1 AS BOOLEAN) AS to_bool,
6 CAST('42.5' AS DOUBLE) AS to_real
7FROM users
Result
originalas_inttotalto_boolto_real
123123200142.5
TIP
CAST fails if conversion is impossible (e.g., CAST('abc' AS INTEGER). Use TRY_CAST for safe conversions that won't crash your query (covered in intermediate lesson).

Practice Conversions

Apply CAST to convert between common types in these practice exercises.

> Complete this query to convert the text-based price into a whole number.

SELECT
  ___(price AS ___)
FROM products
CAST
TRY_CAST
VARCHAR
INTEGER

CAST is explicit and safe when you know the data is valid. For untrusted data from external sources, TRY_CAST is preferable because it returns NULL on failure instead of crashing the query.

Type conversions happen at query time and have a cost. When the same CAST is needed repeatedly, consider storing the converted value in a dedicated column during the ETL process.

Mastering CAST and data types together means you can work confidently with data from any source, whether it arrives with the correct types or needs cleanup before analysis.

PUTTING IT ALL TOGETHER

> You are a data engineer at Stripe designing the first production schema for a payments platform. You must choose column types that store user records and transaction amounts accurately without expensive migrations as data volume scales.

INTEGER vs DECIMAL determines whether transaction amounts store exact cents or silently lose precision through floating-point rounding errors.
VARCHAR length limits on user_name and email columns prevent runaway storage costs while still accommodating the longest realistic inputs.
BOOLEAN columns for is_active and is_verified store binary flags more efficiently than VARCHAR yes/no fields, saving bytes per row at scale.
CAST converts imported string values from CSV onboarding files into proper numeric types so arithmetic on transaction totals works correctly.
KEY TAKEAWAYS
Data types determine storage efficiency, query performance, and data integrity
INTEGER is fastest for counting, IDs, and quantities (4 bytes, hardware-accelerated)
VARCHAR(n) is the standard string type; use CHAR(n) only for fixed-length codes like country codes
BOOLEAN stores TRUE/FALSE/NULL in 1 byte; use it for yes/no flags
CAST converts between types; use it when operations require specific types
Choosing the wrong type has real consequences: wrong sort order, wasted space, failed math
Always store numbers as numeric types and dates as date types, not strings

Data Types: Beginner

INT, VARCHAR, and the lies we tell

Category
SQL
Difficulty
beginner
Duration
19 minutes
Challenges
0 hands-on challenges

Topics covered: Why data types matter, INTEGER for whole numbers, STRING types (VARCHAR), BOOLEAN for true/false, CAST for type conversion

Lesson Sections

  1. Why data types matter (concepts: sqlBasicTypes)

    Data types are not just labels. They fundamentally determine how your data is stored, processed, and queried. The database uses your type choice to decide everything from how much disk space to use, to whether math operations are allowed. SQL databases organize data into three fundamental categories. Every other type is a specialization of these: Type Mismatch Consequences Choosing the wrong type leads to unexpected behavior in operations like arithmetic and sorting. Arithmetic Proves the Type T

  2. INTEGER for whole numbers

    Integer literals in SQL are written as bare numbers without quotes. SQL uses quotes to distinguish strings from numbers: 42 (no quotes) is a number, while '42' (with quotes) is text. Integer Applications Integers are the most common type for IDs, counts, and any whole number data. Understanding storage size helps you choose between integer types for performance-sensitive tables. Integer types are hardware-accelerated in modern CPUs. Operations on integers run faster than equivalent operations on

  3. STRING types (VARCHAR)

    String literals in SQL are enclosed in single quotes. Double quotes are reserved for identifiers (table and column names). This is a common source of confusion for beginners. Choosing String Types Different string types suit different use cases based on length variability and storage needs. One of the most common string operations is measuring how long a value is. Try it yourself. Single quotes delimit string literals in SQL while double quotes delimit identifiers like table and column names. Mi

  4. BOOLEAN for true/false

    Boolean Best Practices Choosing the right type also has real storage consequences at scale. Name boolean columns with a clear prefix like is_, has_, or can_ so readers immediately know the column holds a true/false value without reading documentation.

  5. CAST for type conversion (concepts: sqlCast)

    Practice Conversions