🐍 Python Course

Strings

1. Strings

A Python string is an immutable sequence of Unicode characters.

Quick Reference: String methods

| Method | Purpose | Returns | Complexity | Notes |
| --- | --- | --- | --- | --- |
| lower() | Convert to lowercase | New string | O(n) | Original string unchanged |
| upper() | Convert to uppercase | New string | O(n) | Original string unchanged |
| casefold() | Aggressive lowercase for comparisons | New string | O(n) | Better for case-insensitive matching |
| strip() | Remove whitespace from both ends | New string | O(n) | Use lstrip() / rstrip() for one side |
| lstrip() | Remove whitespace from left side | New string | O(n) | Useful for parsing |
| rstrip() | Remove whitespace from right side | New string | O(n) | Common for file lines |
| replace(old, new) | Replace substring | New string | O(n) | Replaces all occurrences by default |
| split(sep) | Split into list | list[str] | O(n) | Common for parsing text |
| splitlines() | Split multiline text into lines | list[str] | O(n) | Very useful for file/text processing |
| join(iter) | Join iterable into string | New string | O(n) | Very common interview question |
| find(sub) | Find first occurrence | int | O(n) | Returns -1 if not found |
| rfind(sub) | Find last occurrence | int | O(n) | Returns -1 if not found |
| index(sub) | Find first occurrence | int | O(n) | Raises ValueError if not found |
| rindex(sub) | Find last occurrence | int | O(n) | Raises ValueError if not found |
| count(sub) | Count occurrences | int | O(n) | Counts substring matches |
| startswith(prefix) | Check prefix | bool | O(k) | k = prefix length |
| endswith(suffix) | Check suffix | bool | O(k) | Common in path checks |
| partition(sep) | Split once at first separator | tuple | O(n) | Returns (before, sep, after) |
| rpartition(sep) | Split once at last separator | tuple | O(n) | Useful for paths/domains |
| removeprefix(x) | Remove prefix if present | New string | O(k) | Cleaner than slicing |
| removesuffix(x) | Remove suffix if present | New string | O(k) | Great for extensions |
| isdigit() | Check if all chars are digits | bool | O(n) | Useful for validation |
| isnumeric() | Check if all chars are numeric | bool | O(n) | Broader than isdigit() |
| isalpha() | Check if all chars are letters | bool | O(n) | Useful for validation |
| isalnum() | Check letters/digits only | bool | O(n) | Common input checks |
| isspace() | Check whitespace only | bool | O(n) | Useful in parsing |
| islower() | Check lowercase | bool | O(n) | Validation/helper |
| isupper() | Check uppercase | bool | O(n) | Validation/helper |
| format(...) | String formatting | New string | O(n) | Older alternative to f-strings |

Quick Reference: Common string operations

| Operation | Purpose | Returns | Complexity | Notes |
| --- | --- | --- | --- | --- |
| s[i] | Get character by index | Character | O(1) | Strings are immutable |
| s[a:b] | Slice substring | New string | O(k) | k = slice size |
| s[::-1] | Reverse copy | New string | O(n) | Common interview trick |
| len(s) | String length | int | O(1) | Very common |
| x in s | Substring / char check | bool | O(n) | Membership test |
| s1 + s2 | Concatenate strings | New string | O(n + m) | Creates new string |
| s * n | Repeat string | New string | O(n·k) | Common for patterns |
| for ch in s | Iterate characters | Iterator behavior | O(n) total | Very common |
| enumerate(s) | Iterate index + char | Iterator | O(n) total | Common in interviews |
| ord(ch) | Character to Unicode code point | int | O(1) | Useful in algorithms |
| chr(i) | Code point to character | Character | O(1) | Reverse of ord() |

Example:

s = "python"

This means two important things.

  • It is ordered: Each character has a position, or index.
s[0]   # 'p'
s[1]   # 'y'
  • It is immutable: You cannot change characters in place.
s[0] = "P"

=> TypeError: 'str' object does not support item assignment

If you "modify" a string, Python creates a new string object.

This is extremely important to know.

Example:

s = "python"
s = s.upper()

The original string is not mutated. upper() creates a new uppercase string, and the name s is then rebound to that new string.
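A minimal sketch of the rebinding described above. The names original and alias are illustrative and not part of the course text. A second name bound to the first string keeps seeing the old value, which suggests the original object was never changed.

```python
original = "python"
alias = original              # a second name bound to the same string object

original = original.upper()   # builds a new string and rebinds `original`

print(original)  # PYTHON
print(alias)     # python (the first string object was never modified)
```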


f-strings

f-strings (formatted string literals) allow you to embed expressions directly inside string literals using curly braces {}. The f prefix tells Python to evaluate the contents of the braces.

Basic syntax

name = "Ruben"
age = 30
msg = f"My name is {name} and I am {age}"   # 'My name is Ruben and I am 30'

The variables name and age are evaluated and inserted into the string.

Expressions inside f-strings

You can include any Python expression inside the braces:

f"{2 + 3}"                # '5'
f"{len('hello')}"         # '5'
f"{name.upper()}"         # 'RUBEN'
f"{age * 2}"              # '60'

Formatting numbers

You can format values using a colon : inside the braces:

price = 12.3456
f"{price:.2f}"            # '12.35' (2 decimal places)
f"{price:.1f}"            # '12.3' (1 decimal place)
f"{1000000:,}"            # '1,000,000' (comma separator)

Why f-strings matter

f-strings are the modern, preferred way to format strings in Python 3.6+. They are:

  • Readable: variables are visible directly in the string
  • Fast: faster than older methods like .format() or % formatting
  • Flexible: any expression can go inside the braces

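Common in production code for formatting output and building messages. For calls to the logging module, %-style arguments (logger.info("user %s", name)) are often preferred instead, because the message is only formatted if the record is actually emitted.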
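Beyond the precision and comma specifiers shown under "Formatting numbers", a few other format specifiers come up often. The following is a small sketch; the variable names ratio and count are illustrative assumptions, and the = specifier needs Python 3.8 or newer.

```python
name = "Ruben"
ratio = 0.256
count = 42

padded = f"{count:05d}"     # zero-pad to width 5      -> '00042'
right = f"{name:>8}"        # right-align in width 8   -> '   Ruben'
percent = f"{ratio:.1%}"    # percentage, 1 decimal    -> '25.6%'
hex_value = f"{255:x}"      # lowercase hexadecimal    -> 'ff'
debug = f"{count=}"         # Python 3.8+ debug form   -> 'count=42'

print(padded, right, percent, hex_value, debug)
```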


Indexing and slicing

Basic indexing

s = "python"

s[0]   # 'p'
s[1]   # 'y'
s[-1]  # 'n'
s[-2]  # 'o'

Negative indices count from the end.

This is a very common Python idiom.

Slicing syntax

General syntax:

s[start:stop:step]

Where:

  • start = inclusive
  • stop = exclusive
  • step = jump size

Example:

s = "python"

s[0:3]   # 'pyt'
s[:3]    # 'pyt'
s[3:]    # 'hon'
s[::2]   # 'pto'
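One way to see why an exclusive stop index is convenient is the sketch below. The split position k is an arbitrary illustrative value.

```python
s = "python"
k = 2

head, tail = s[:k], s[k:]
assert head + tail == s        # splitting at k loses nothing and repeats nothing
assert len(s[1:4]) == 4 - 1    # the length of s[a:b] is simply b - a

# Out-of-range slice bounds are clamped instead of raising IndexError.
print(s[2:100])   # thon
print(s[10:])     # prints an empty line
```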

Reversing strings

Pythonic way

s = "python"
reversed_s = s[::-1]   # 'nohtyp'

This uses slicing with step = -1

Alternative: reversed()

''.join(reversed(s))

Important distinction

reversed() returns an iterator, not a string.

So join() is needed to create the string.

Interview reasoning

A strong answer should mention:

  • slicing is concise and idiomatic
  • reversed() is explicit
  • both are O(n)

Definitions:

  • Concise: it expresses the same logic using fewer characters and less code while remaining clear.
  • Idiomatic: it follows the style and conventions that experienced Python developers naturally use and expect to see.
  • Explicit: it makes the intended action more directly visible in the code, so the reader immediately understands what is happening.

Important string methods

lower() / upper()

Convert all characters to lowercase or uppercase.

s = "Hello World"
s.lower()    # 'hello world'
s.upper()    # 'HELLO WORLD'

Used constantly in normalization: converting data into a consistent, standard format for reliable processing.

Realistic backend example:

email = email.strip().lower()

This normalizes email input by removing surrounding whitespace and standardizing case. Email domains are case-insensitive, and in practice most providers treat the part before the @ as case-insensitive too.
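The quick-reference table lists casefold() as the better choice for case-insensitive matching. A short sketch of the difference from lower() follows; the helper name same_text is hypothetical.

```python
def same_text(a: str, b: str) -> bool:
    """Case-insensitive comparison using casefold() (illustrative helper)."""
    return a.casefold() == b.casefold()

print("Straße".lower())       # straße  (lower() keeps the German sharp s)
print("Straße".casefold())    # strasse (casefold() expands it to 'ss')
print(same_text("STRASSE", "Straße"))   # True
```

For plain ASCII text, lower() and casefold() give the same result, so the difference only matters for international input.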

split()

This is extremely important for backend work.

Used constantly in:

  • parsing input
  • CSV-like data
  • API payload cleanup
  • tokenization

Basic usage

text = "python interview prep"
words = text.split()   # ['python', 'interview', 'prep']

By default it splits on whitespace.

Custom separator

csv = "a,b,c"
parts = csv.split(",")   # ['a', 'b', 'c']

Important subtlety

"a  b".split()   # ['a', 'b']

split() splits on any whitespace and collapses repeated whitespace.

But:

"a  b".split(" ")   # ['a', '', 'b']

split(" ") splits on exactly one space character and preserves empty fields.

This distinction is interview-worthy.
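A few related splitting tools from the quick-reference table, sketched with made-up input strings: a second argument to split() limits the number of splits, partition() splits once and always returns three parts, and splitlines() handles mixed line endings.

```python
# Limit the number of splits with the second argument (maxsplit).
print("a=b=c".split("=", 1))              # ['a', 'b=c']

# partition() splits once and always returns a 3-tuple.
key, sep, value = "timeout=30".partition("=")
print(key, value)                         # timeout 30
print("no_separator".partition("="))      # ('no_separator', '', '')

# splitlines() understands \n and \r\n and drops the line endings.
print("one\ntwo\r\nthree".splitlines())   # ['one', 'two', 'three']
```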


join()

join() combines the strings from an iterable into a single string. The string you call join() on (for example " ") becomes the separator placed between elements.

Examples

" ".join(["python", "interview"])   # 'python interview'
"-".join(["a", "b", "c"])           # 'a-b-c'
"".join(["a", "b", "c"])            # 'abc' (no separator)
" | ".join(["x", "y", "z"])         # 'x | y | z'

Why join() matters

A very common interview question is:

Why use join() instead of repeated +?

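Because strings are immutable, each += may build a new string and copy everything accumulated so far, which can make the loop quadratic in the total length. join() computes the final size once and copies each piece a single time.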

Bad:

result = ""
for word in words:
    result += word

Better (more efficient):

result = "".join(words)
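A practical detail when switching to join(): it only accepts strings, so other values have to be converted first. A minimal sketch, with the list numbers as an illustrative assumption:

```python
numbers = [1, 2, 3]

# join() only accepts strings, so convert each item first.
csv_line = ",".join(str(n) for n in numbers)
print(csv_line)   # 1,2,3

# Collect pieces in a list and join once at the end.
parts = []
for n in numbers:
    parts.append(f"item-{n}")
result = " ".join(parts)
print(result)     # item-1 item-2 item-3

# Passing non-string items raises TypeError.
try:
    ",".join(numbers)
except TypeError as exc:
    print("TypeError:", exc)
```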

strip()

strip() removes whitespace (or specified characters) from both the beginning and end of a string.

Basic usage

s = "  hello  "
s.strip()   # 'hello'

Variants

s.lstrip()   # "hello  " (removes from left only)
s.rstrip()   # "  hello" (removes from right only)

Custom characters

Instead of whitespace, you can remove any characters from the edges:

"---hello---".strip("-")   # 'hello'

Important note

strip() removes characters from the edges only, not from inside the string:

"  hel  lo  ".strip()   # 'hel  lo' (inside spaces untouched)

replace()

replace() finds all occurrences of a substring and replaces them with another string, returning a new string.

Basic syntax

s = "hello world"
s.replace("world", "python")   # 'hello python'

More examples

"banana".replace("a", "o")     # 'bonono' (all occurrences)
"hello hello".replace("hello", "hi")  # 'hi hi'

Limiting replacements

You can limit how many replacements to make:

"banana".replace("a", "o", 1)  # 'bonana' (only first occurrence)
"banana".replace("a", "o", 2)  # 'bonona' (only first 2 occurrences)

Important: immutability

replace() returns a new string. The original string is unchanged:

s = "hello"
s.replace("h", "H")  # Returns 'Hello', but s is still 'hello'
s = s.replace("h", "H")  # Now s is 'Hello' (must reassign)

startswith() / endswith()

Check if a string begins or ends with a specific substring. Returns True or False.

filename = "document.json"
filename.endswith(".json")        # True
filename.endswith(".txt")         # False

url = "https://example.com"
url.startswith("https")           # True

Very common in validation and routing.
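Two related details, sketched with an illustrative filename: startswith() and endswith() accept a tuple of candidates, and removesuffix() (Python 3.9+) is usually safer than rstrip() for dropping an extension, because rstrip() treats its argument as a set of characters rather than a suffix.

```python
filename = "report.txt"

# startswith()/endswith() accept a tuple of candidates.
print(filename.endswith((".txt", ".md")))   # True

# removesuffix() (Python 3.9+) removes the exact suffix once.
print(filename.removesuffix(".txt"))        # report

# rstrip() strips any of the characters '.', 't', 'x' from the right,
# so it also eats the final 't' of 'report'.
print(filename.rstrip(".txt"))              # repor
```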

find() vs index()

Both search for a substring and return its position.

find() returns -1 if not found:

s = "hello world"
s.find("world")     # 6
s.find("xyz")       # -1 (not found)

index() raises a ValueError if not found:

s = "hello world"
s.index("world")    # 6
s.index("xyz")      # ValueError: substring not found

Use find() when you want a safe "not found" value. Use index() when you expect the substring to exist and want an error if it doesn't.
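One pitfall with find() is that -1 is also a valid index, so the result should be checked before it is used. A small sketch; the helper name text_after is hypothetical.

```python
from typing import Optional

def text_after(s: str, marker: str) -> Optional[str]:
    """Return the text following the first `marker`, or None if it is absent."""
    pos = s.find(marker)
    if pos == -1:                 # check explicitly: -1 is also a valid index
        return None
    return s[pos + len(marker):]

print(text_after("user@example.com", "@"))   # example.com
print(text_after("no-at-sign", "@"))         # None

# Pitfall: using the -1 result directly silently indexes from the end.
s = "hello"
print(s[s.find("z")])                        # o  (this is s[-1], not an error)
```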

count()

Count how many times a substring appears in a string.

s = "banana"
s.count("a")        # 3
s.count("na")       # 2

s = "hello hello"
s.count("hello")    # 2

Useful for validation and analysis tasks.

in operator (membership testing)

Check if a substring exists anywhere in a string. Very efficient for simple existence checks.

"world" in "hello world"    # True
"xyz" in "hello world"      # False

if "@" in email:
    print("contains an @ sign")   # a rough sanity check, not full email validation

Preferred over find() when you just need True/False.

Type checking methods

These methods check whether every character in the string belongs to a given category. They all return False for an empty string.

"123".isdigit()         # True (all characters are digits)
"12.3".isdigit()        # False (contains a dot)
"½".isdigit()           # False (a fraction is numeric, but not a digit)

"123".isnumeric()       # True (all characters are numeric)
"½".isnumeric()         # True (fractions are numeric)
"12.3".isnumeric()      # False (dot is not numeric)
"-5".isnumeric()        # False (minus sign is not numeric)

"abc".isalpha()         # True (all characters are alphabetic)
"abc123".isalpha()      # False (contains digits)
"abc123".isalnum()      # True (all characters are alphanumeric)
"abc 123".isalnum()     # False (contains a space)
"  ".isspace()          # True (all characters are whitespace)
"HELLO".isupper()       # True (all letters are uppercase)
"hello".islower()       # True (all letters are lowercase)

Key difference: isdigit() vs isnumeric()

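  • isdigit(): Digits 0-9 plus other Unicode digit characters, such as superscripts ("²") and digits from other scripts
  • isnumeric(): Everything isdigit() accepts, plus fractions ("½") and other numeric Unicode characters
  • Both fail for decimal points and negative signs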
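A short sketch comparing these predicates on single characters. isdecimal() is not covered above and is included here as the strictest of the three; the helper name is_ascii_digits is hypothetical, and isascii() needs Python 3.7 or newer.

```python
samples = {
    "5": "ASCII digit",
    "²": "superscript two",
    "½": "vulgar fraction one half",
    "\u0663": "Arabic-Indic digit three",
}

for ch, label in samples.items():
    print(label, ch.isdecimal(), ch.isdigit(), ch.isnumeric())
# ASCII digit True True True
# superscript two False True True
# vulgar fraction one half False False True
# Arabic-Indic digit three True True True

def is_ascii_digits(s: str) -> bool:
    """True only for non-empty strings made of the characters 0-9."""
    return s.isascii() and s.isdigit()

print(is_ascii_digits("2024"))   # True
print(is_ascii_digits("²"))      # False
```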

Checking if a string is a valid number (including floats)

Type checking methods can't validate actual numbers with decimals or negative signs. Use try/except to safely convert:

def is_number(s: str) -> bool:
    try:
        float(s)  # Accepts integers, floats, negatives, scientific notation
        return True
    except ValueError:
        return False

is_number("123")        # True
is_number("12.5")       # True
is_number("-42")        # True
is_number("3.14e-2")    # True (scientific notation)
is_number("abc")        # False
is_number("12.3.4")     # False

This is a robust, Pythonic way to validate numeric input in real code, as long as everything float() accepts is valid for your use case.
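float() accepts a few inputs that may be surprising in validation code. The sketch below shows some of them, plus one possible stricter variant; the helper name is_finite_number is hypothetical, and whether to reject these values depends on the application.

```python
import math

def is_finite_number(s: str) -> bool:
    """Like is_number(), but also rejects 'nan' and 'inf' (illustrative helper)."""
    try:
        return math.isfinite(float(s))
    except ValueError:
        return False

# float() is more permissive than it may first appear:
print(float(" 12 "))     # 12.0   (surrounding whitespace is ignored)
print(float("1_000"))    # 1000.0 (underscores between digits are allowed)
print(float("inf"))      # inf

print(is_finite_number("inf"))   # False
print(is_finite_number("nan"))   # False
print(is_finite_number("-42"))   # True
```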


Verbal interview questions

Answer these out loud:

  1. Why are strings immutable?
  2. Why is join() preferred over repeated +?
  3. Explain why the slicing stop index is exclusive.
  4. What is the difference between split() and split(" ")?
  5. Why does replace() not mutate the original string?

Coding drills

Drill 1: palindrome

def is_palindrome(s: str) -> bool:
    ...

Drill 2: normalize email

def normalize_email(email: str) -> str:
    ...

Drill 3: reverse words

def reverse_words(text: str) -> str:
    ...

Common interview problems

Strings are often tested through problems like:

Problem 1: Palindrome Check

Question: Given a string, determine if it is a palindrome (reads the same forwards and backwards). Ignore spaces, punctuation, and case.

def isPalindrome(s: str) -> bool:
    # "A man, a plan, a canal: Panama" → True
    # "race a car" → False
    # "0P" → False
    ...

Visual: "racecar" → reverse is "racecar" → True

Key insight: Remove non-alphanumeric characters, convert to lowercase, compare with reverse.
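One possible implementation of this key insight, as a sketch rather than the only accepted answer. The function name is_palindrome_filtered is an illustrative choice.

```python
def is_palindrome_filtered(s: str) -> bool:
    """Keep only letters and digits, lowercase them, compare with the reverse."""
    cleaned = [ch.lower() for ch in s if ch.isalnum()]
    return cleaned == cleaned[::-1]

print(is_palindrome_filtered("A man, a plan, a canal: Panama"))   # True
print(is_palindrome_filtered("race a car"))                       # False
print(is_palindrome_filtered("0P"))                               # False
```

This uses O(n) extra space for the cleaned list. A two-pointer scan from both ends avoids that copy at the cost of slightly longer code.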


Problem 2: Reverse Words in String

Question: Given a string, reverse the order of words (not characters).

def reverseWords(s: str) -> str:
    # "the sky is blue" → "blue is sky the"
    # "  hello world  " → "world hello"
    # Words are separated by single space; strip leading/trailing spaces
    ...

Visual: ["the", "sky", "is", "blue"] → ["blue", "is", "sky", "the"] → join with space

Key insight: split() with no arguments splits on runs of whitespace and never produces empty strings, so the solution is split, reverse, and rejoin with a single space.
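A compact sketch of that approach. The function name reverse_words_sketch is illustrative.

```python
def reverse_words_sketch(s: str) -> str:
    """Reverse word order; extra whitespace is dropped by split()."""
    words = s.split()                  # no empty strings, even with repeated spaces
    return " ".join(reversed(words))

print(reverse_words_sketch("the sky is blue"))     # blue is sky the
print(reverse_words_sketch("  hello world  "))     # world hello
```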


Problem 3: Anagram Check

Question: Given two strings, determine if they are anagrams (contain exactly the same characters with the same counts, in any order).

def isAnagram(s: str, t: str) -> bool:
    # s = "anagram", t = "nagaram" → True
    # s = "rat", t = "car" → False
    # Assume lowercase letters only
    ...

Visual: "anagram" has characters [a, n, a, g, r, a, m] and "nagaram" has same characters → True

Key insight: Sort both strings and compare, or use character frequency counting with Counter.

from collections import Counter
Counter(s) == Counter(t)  # True if anagrams

Problem 4: First Unique Character

Question: Given a string, find the index of the first character that appears only once. Return -1 if no such character exists.

def firstUniqChar(s: str) -> int:
    # s = "leetcode" → 0 ('l' appears once at index 0)
    # s = "loveleetcode" → 2 ('v' is first unique)
    # s = "aabb" → -1 (no unique chars)
    ...

Visual: "leetcode" → 'l' appears once, at index 0 → return 0

Key insight: Count character frequencies, then iterate through string to find first with count == 1.

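from collections import Counter

def firstUniqChar(s: str) -> int:
    counts = Counter(s)               # first pass: count every character
    for i, char in enumerate(s):      # second pass: scan in original order
        if counts[char] == 1:
            return i
    return -1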

Problem 5: Longest Substring Without Repeating Characters

Question: Given a string, find the length of the longest substring without repeating characters.

def lengthOfLongestSubstring(s: str) -> int:
    # s = "abcabcbb" → 3 ("abc")
    # s = "bbbbb" → 1 ("b")
    # s = "pwwkew" → 3 ("wke")
    ...

Visual: "abcabcbb" → sliding window finds "abc" with length 3

Key insight: Sliding window with a set or dictionary tracking character positions. Move left pointer when duplicate found.

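def lengthOfLongestSubstring(s: str) -> int:
    char_index = {}   # last index at which each character was seen
    max_len = 0
    left = 0          # left edge of the current window

    for right, char in enumerate(s):
        # A repeat inside the window: move left just past its previous position
        if char in char_index and char_index[char] >= left:
            left = char_index[char] + 1
        char_index[char] = right
        max_len = max(max_len, right - left + 1)

    return max_len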

Complexity: O(n) with sliding window.


Problem 6: Valid Parentheses

Question: Given a string containing just the characters '(', ')', '{', '}', '[' and ']', determine if the input string is valid (each closing bracket has a matching opening bracket of the same type, in correct order).

def isValid(s: str) -> bool:
    # "()" → True
    # "()[]{}" → True
    # "(]" → False (mismatched)
    # "([{}])" → True (nested)
    # "([)]" → False (incorrect order)
    ...

Visual: Process left-to-right:

  • "([)]" → see (, push to stack
  • See [, push to stack
  • See ), pop [ from stack → mismatch! → False

Key insight: Use a stack (list) to match brackets. Push opening brackets, pop and verify match on closing brackets.

def isValid(s: str) -> bool:
    stack = []
    pairs = {'(': ')', '[': ']', '{': '}'}   # opening bracket -> expected closer

    for char in s:
        if char in pairs:
            stack.append(char)
        elif not stack or pairs[stack.pop()] != char:
            return False

    return not stack   # leftover openers mean unmatched brackets

Complexity: O(n) time, O(n) space for stack.


Problem 7: Group Anagrams

Question: Given a list of strings, group anagrams together.

def groupAnagrams(strs: list[str]) -> list[list[str]]:
    # strs = ["eat", "tea", "ate", "nat", "tan", "bat"]
    # Return [["eat", "tea", "ate"], ["nat", "tan"], ["bat"]]
    # (or any grouping order)
    ...

Visual: Sort characters in each word:

  • "eat" → "aet"
  • "tea" → "aet" (same key!)
  • "ate" → "aet" (same key!)

Group by sorted key.

Key insight: Sort characters within each word as a grouping key. Use a dictionary with sorted string as key.

from collections import defaultdict

def groupAnagrams(strs: list[str]) -> list[list[str]]:
    groups = defaultdict(list)
    for word in strs:
        key = ''.join(sorted(word))   # anagrams share the same sorted key
        groups[key].append(word)

    return list(groups.values())

Problem 8: Longest Common Prefix

Question: Given a list of strings, find the longest string that is a prefix of all of them.

def longestCommonPrefix(strs: list[str]) -> str:
    # strs = ["flower", "flow", "flight"] → "fl"
    # strs = ["dog", "racecar", "car"] → ""
    # strs = ["interspecies", "interstellar"] → "inters"
    ...

Visual: Compare characters position by position across all strings:

  • Position 0: all have "f" → keep
  • Position 1: all have "l" → keep
  • Position 2: "o", "o", "i" → stop

Result: "fl"

Key insight: Iterate through character positions. Stop when any string runs out or characters don't match.

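def longestCommonPrefix(strs: list[str]) -> str:
    if not strs:
        return ""

    for i in range(len(strs[0])):
        char = strs[0][i]
        for j in range(1, len(strs)):
            # Stop at the first string that is too short or disagrees
            if i >= len(strs[j]) or strs[j][i] != char:
                return strs[0][:i]

    return strs[0]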

Problem 9: Implement strStr() (KMP Pattern Matching)

Question: Implement a function that finds the index of the first occurrence of a needle in a haystack. Return -1 if not found (like find()).

def strStr(haystack: str, needle: str) -> int:
    # haystack = "sadbutsad", needle = "sad" → 0
    # haystack = "leetcode", needle = "leeto" → -1
    # needle = "" → 0 (empty string matches at start)
    ...

Simple approach: Check every position.

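def strStr(haystack: str, needle: str) -> int:
    # Try every start position where needle could still fit
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i:i+len(needle)] == needle:
            return i
    return -1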

Key insight: Can use string's built-in find() method, or implement KMP algorithm for O(n+m) instead of O(n*m).

Production context: haystack.find(needle) is simpler.
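For reference, a sketch of the KMP approach mentioned in the key insight. The function names build_lps and kmp_find are illustrative choices. The idea is to precompute, for each prefix of the needle, the length of its longest proper prefix that is also a suffix, so that after a mismatch the search never moves backwards in the haystack.

```python
def build_lps(pattern: str) -> list[int]:
    """lps[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    lps = [0] * len(pattern)
    k = 0                                   # length of the current matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = lps[k - 1]                  # fall back to a shorter prefix
        if pattern[i] == pattern[k]:
            k += 1
        lps[i] = k
    return lps

def kmp_find(haystack: str, needle: str) -> int:
    """Index of the first occurrence of needle in haystack, or -1."""
    if not needle:
        return 0
    lps = build_lps(needle)
    k = 0                                   # needle characters matched so far
    for i, ch in enumerate(haystack):
        while k > 0 and ch != needle[k]:
            k = lps[k - 1]                  # reuse the part that still matches
        if ch == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 1                # start index of the match
    return -1

print(kmp_find("sadbutsad", "sad"))    # 0
print(kmp_find("leetcode", "leeto"))   # -1
print(kmp_find("aaab", "aab"))         # 1
```

Each haystack character is visited once and the fallback steps are bounded by the number of earlier matches, which gives the O(n+m) bound.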


Problem 10: Word Ladder

Question: Given two words (start and end) and a dictionary, find the length of the shortest transformation sequence from start to end, where each step changes exactly one letter and every word after start must be in the dictionary. Return 0 if no such sequence exists.

def ladderLength(beginWord: str, endWord: str, wordList: list[str]) -> int:
    # beginWord = "hit", endWord = "cog"
    # wordList = ["hot", "dot", "dog", "lot", "log", "cog"]
    # "hit" → "hot" → "dot" → "dog" → "cog"
    # Return length = 5
    ...

Visual: Graph problem where edges connect words differing by one letter. Use BFS to find shortest path.

Key insight: BFS with pattern matching. For each word, generate all possible one-letter variations and check dictionary.

from collections import deque

def ladderLength(beginWord: str, endWord: str, wordList: list[str]) -> int:
    word_set = set(wordList)          # O(1) membership checks
    if endWord not in word_set:
        return 0

    queue = deque([(beginWord, 1)])   # (word, number of words in the path so far)

    while queue:
        word, length = queue.popleft()

        if word == endWord:
            return length

        for i in range(len(word)):
            for c in 'abcdefghijklmnopqrstuvwxyz':
                if c != word[i]:
                    new_word = word[:i] + c + word[i+1:]
                    if new_word in word_set:
                        word_set.remove(new_word)   # mark as visited
                        queue.append((new_word, length + 1))

    return 0

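Complexity: O(n * l * 26) candidate words, where n = words in dictionary and l = word length. Building and hashing each candidate costs another O(l), so the total time is O(n * l² * 26).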