Strings
1. Strings
A Python string is an immutable sequence of Unicode characters.
Quick Reference: String methods
| Method | Purpose | Returns | Complexity | Notes |
|---|---|---|---|---|
| lower() | Convert to lowercase | New string | O(n) | Original string unchanged |
| upper() | Convert to uppercase | New string | O(n) | Original string unchanged |
| casefold() | Aggressive lowercase for comparisons | New string | O(n) | Better for case-insensitive matching |
| strip() | Remove whitespace from both ends | New string | O(n) | Use lstrip() / rstrip() for one side |
| lstrip() | Remove whitespace from left side | New string | O(n) | Useful for parsing |
| rstrip() | Remove whitespace from right side | New string | O(n) | Common for file lines |
| replace(old, new) | Replace substring | New string | O(n) | Replaces all occurrences by default |
| split(sep) | Split into list | list[str] | O(n) | Common for parsing text |
| splitlines() | Split multiline text into lines | list[str] | O(n) | Very useful for file/text processing |
| join(iter) | Join iterable into string | New string | O(n) | Very common interview question |
| find(sub) | Find first occurrence | int | O(n) | Returns -1 if not found |
| rfind(sub) | Find last occurrence | int | O(n) | Returns -1 if not found |
| index(sub) | Find first occurrence | int | O(n) | Raises ValueError if not found |
| rindex(sub) | Find last occurrence | int | O(n) | Raises ValueError if not found |
| count(sub) | Count occurrences | int | O(n) | Counts non-overlapping substring matches |
| startswith(prefix) | Check prefix | bool | O(k) | k = prefix length |
| endswith(suffix) | Check suffix | bool | O(k) | k = suffix length; common in path checks |
| partition(sep) | Split once at first separator | tuple | O(n) | Returns (before, sep, after) |
| rpartition(sep) | Split once at last separator | tuple | O(n) | Useful for paths/domains |
| removeprefix(x) | Remove prefix if present | New string | O(n) | Python 3.9+; cleaner than slicing |
| removesuffix(x) | Remove suffix if present | New string | O(n) | Python 3.9+; great for extensions |
| isdigit() | Check if all chars are digits | bool | O(n) | Useful for validation |
| isnumeric() | Check if all chars are numeric | bool | O(n) | Broader than isdigit() |
| isalpha() | Check if all chars are letters | bool | O(n) | Useful for validation |
| isalnum() | Check letters/digits only | bool | O(n) | Common input checks |
| isspace() | Check whitespace only | bool | O(n) | Useful in parsing |
| islower() | Check lowercase | bool | O(n) | Validation/helper |
| isupper() | Check uppercase | bool | O(n) | Validation/helper |
| format(...) | String formatting | New string | O(n) | Older alternative to f-strings |
Quick Reference: Common string operations
| Operation | Purpose | Returns | Complexity | Notes |
|---|---|---|---|---|
| s[i] | Get character by index | str (length 1) | O(1) | Strings are immutable |
| s[a:b] | Slice substring | New string | O(k) | k = slice size |
| s[::-1] | Reverse copy | New string | O(n) | Common interview trick |
| len(s) | String length | int | O(1) | Very common |
| x in s | Substring / char check | bool | O(n) | Membership test |
| s1 + s2 | Concatenate strings | New string | O(n + m) | Creates new string |
| s * n | Repeat string | New string | O(n·k) | k = len(s); common for patterns |
| for ch in s | Iterate characters | Characters, one at a time | O(n) total | Very common |
| enumerate(s) | Iterate index + char | Iterator | O(n) total | Common in interviews |
| ord(ch) | Character to Unicode code point | int | O(1) | Useful in algorithms |
| chr(i) | Code point to character | str (length 1) | O(1) | Reverse of ord() |
Example:
s = "python"
This means two important things.
- It is ordered: Each character has a position, or index.
s[0] # 'p'
s[1] # 'y'
- It is immutable: You cannot change characters in place.
s[0] = "P"
=> TypeError: 'str' object does not support item assignment
If you "modify" a string, Python creates a new string object.
This is extremely important to know.
Example:
s = "python"
s = s.upper()
The original string is not mutated. A new uppercase string is created, and the variable s is then rebound to it.
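The rebinding can be observed directly by keeping a second name for the original string. A minimal sketch; the names `original` and `alias` are illustrative only:

```python
original = "python"
alias = original              # second name bound to the same string object

original = original.upper()   # builds a new string and rebinds `original`

print(original)  # PYTHON
print(alias)     # python  (the first string object was never modified)
```

Because no operation can change a string in place, any other name that still refers to the old object keeps seeing the old value.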
f-strings
f-strings (formatted string literals) allow you to embed expressions directly inside string literals using curly braces {}. The f prefix tells Python to evaluate the contents of the braces.
Basic syntax
name = "Ruben"
age = 30
msg = f"My name is {name} and I am {age}" # 'My name is Ruben and I am 30'
The variables name and age are evaluated and inserted into the string.
Expressions inside f-strings
You can include any Python expression inside the braces:
f"{2 + 3}" # '5'
f"{len('hello')}" # '5'
f"{name.upper()}" # 'RUBEN'
f"{age * 2}" # '60'
Formatting numbers
You can format values using a colon : inside the braces:
price = 12.3456
f"{price:.2f}" # '12.35' (2 decimal places)
f"{price:.1f}" # '12.3' (1 decimal place)
f"{1000000:,}" # '1,000,000' (comma separator)
Why f-strings matter
f-strings are the modern, preferred way to format strings in Python 3.6+. They are:
- Readable: variables are visible directly in the string
- Fast: faster than older methods like .format() or % formatting
- Flexible: any expression can go inside the braces
Common in production code for logging, formatting output, and building messages.
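For comparison, the same message can be built with the two older styles mentioned above. A small sketch reusing the `name` and `age` values from the earlier example; the variable names for the results are illustrative:

```python
name = "Ruben"
age = 30

with_fstring = f"My name is {name} and I am {age}"
with_format = "My name is {} and I am {}".format(name, age)
with_percent = "My name is %s and I am %d" % (name, age)

# All three styles build the same string.
print(with_fstring == with_format == with_percent)  # True
```

The f-string keeps each value next to the text it belongs to, which is the readability argument made above.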
Indexing and slicing
Basic indexing
s = "python"
s[0] # 'p'
s[1] # 'y'
s[-1] # 'n'
s[-2] # 'o'
Negative indices count from the end.
This is a very common Python idiom.
Slicing syntax
General syntax:
s[start:stop:step]
Where:
- start = inclusive
- stop = exclusive
- step = jump size
Example:
s = "python"
s[0:3] # 'pyt'
s[:3] # 'pyt'
s[3:] # 'hon'
s[::2] # 'pto'
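The exclusive stop index has a few practical consequences that are easy to check. A short sketch, with `k` as an arbitrary split point chosen for illustration:

```python
s = "python"
k = 3

# Because stop is exclusive, the two halves never overlap or drop a character.
print(s[:k] + s[k:] == s)   # True
print(len(s[1:4]))          # 3, i.e. stop - start

# Out-of-range slice bounds are clamped; out-of-range indexing raises.
print(s[2:100])             # thon
try:
    s[100]
except IndexError as exc:
    print(type(exc).__name__)  # IndexError
```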
Reversing strings
Pythonic way
s = "python"
reversed_s = s[::-1] # 'nohtyp'
This uses slicing with step = -1, which walks the string from the last character to the first.
Alternative: reversed()
''.join(reversed(s))
Important distinction
reversed() returns an iterator, not a string.
So join() is needed to create the string.
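The difference between the two approaches can be seen by inspecting what `reversed()` returns. A minimal sketch; the variable names are illustrative:

```python
s = "python"

rev_iter = reversed(s)            # an iterator, not a str
print(isinstance(rev_iter, str))  # False

joined = "".join(rev_iter)        # consuming the iterator builds the string
print(joined)                     # nohtyp
print(joined == s[::-1])          # True
```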
Interview reasoning
A strong answer should mention:
- slicing is concise and idiomatic
- reversed() is explicit
- both are O(n)
Definitions:
- Concise: it expresses the same logic using fewer characters and less code while remaining clear.
- Idiomatic: it follows the style and conventions that experienced Python developers naturally use and expect to see.
- Explicit: it makes the intended action more directly visible in the code, so the reader immediately understands what is happening.
Important string methods
lower() / upper()
Convert all characters to lowercase or uppercase.
s = "Hello World"
s.lower() # 'hello world'
s.upper() # 'HELLO WORLD'
Used constantly in normalization: converting data into a consistent, standard format for reliable processing.
Realistic backend example:
email = email.strip().lower()
This normalizes email input by removing whitespace and standardizing case (emails are case-insensitive in practice).
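For case-insensitive comparison of arbitrary text, the quick reference lists casefold() as the stronger option. The sketch below shows one case where lower() and casefold() differ; the helper name `same_text` is hypothetical:

```python
def same_text(a: str, b: str) -> bool:
    """Case-insensitive comparison using casefold() (hypothetical helper)."""
    return a.casefold() == b.casefold()


print("Straße".lower())                        # straße
print("Straße".casefold())                     # strasse
print(same_text("Straße", "STRASSE"))          # True
print("Straße".lower() == "STRASSE".lower())   # False
```

For plain ASCII input such as most email addresses, lower() and casefold() give the same result.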
split()
This is extremely important for backend work.
Used constantly in:
- parsing input
- CSV-like data
- API payload cleanup
- tokenization
Basic usage
text = "python interview prep"
words = text.split() # ['python', 'interview', 'prep']
By default it splits on whitespace.
Custom separator
csv = "a,b,c"
parts = csv.split(",") # ['a', 'b', 'c']
Important subtlety
"a  b".split() # ['a', 'b'] (two spaces between a and b)
split() splits on any whitespace and collapses repeated whitespace.
But:
"a  b".split(" ") # ['a', '', 'b']
split(" ") splits on exactly one space character and preserves empty fields.
This distinction is interview-worthy.
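In parsing code, split() is often combined with strip() to clean each field. A minimal sketch under the assumption of simple comma-separated input without quoting; the helper name `parse_fields` is hypothetical:

```python
def parse_fields(line: str) -> list[str]:
    """Split a comma-separated line and trim each field (hypothetical helper)."""
    return [field.strip() for field in line.split(",")]


print(parse_fields("a, b ,c"))   # ['a', 'b', 'c']
print(parse_fields("a,,c"))      # ['a', '', 'c']  (empty field preserved)

# maxsplit limits the number of splits, counted from the left.
print("key=value=more".split("=", 1))  # ['key', 'value=more']
```

Real CSV data with quoted fields is better handled by the standard csv module than by split().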
join()
join() combines multiple strings from an iterable into a single string. The string you call join() on (for example, " ") becomes the glue between elements.
Examples
" ".join(["python", "interview"]) # 'python interview'
"-".join(["a", "b", "c"]) # 'a-b-c'
"".join(["a", "b", "c"]) # 'abc' (no separator)
" | ".join(["x", "y", "z"]) # 'x | y | z'
Why join() matters
A very common interview question is:
Why use join() instead of repeated +?
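Because strings are immutable, each += can build a new string and copy everything accumulated so far. Over many pieces this creates many temporary strings and can cost O(n²) time in the worst case, whereas join() computes the final size once and builds the result in a single pass.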
Bad:
result = ""
for word in words:
    result += word
Better (more efficient):
result = "".join(words)
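A common pattern is to collect pieces in a list and join once at the end. The sketch below also shows that join() accepts only strings, so other types need converting first; the sample data is made up for illustration:

```python
words = ["python", "interview", "prep"]

# Collect the pieces, then join once at the end.
result = " ".join(words)
print(result)  # python interview prep

# join() accepts only strings, so convert other types first.
numbers = [1, 2, 3]
csv_row = ",".join(str(n) for n in numbers)
print(csv_row)  # 1,2,3

try:
    ",".join(numbers)  # ints are not accepted
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```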
strip()
strip() removes whitespace (or specified characters) from both the beginning and end of a string.
Basic usage
s = " hello "
s.strip() # 'hello'
Variants
s.lstrip() # "hello " (removes from left only)
s.rstrip() # " hello" (removes from right only)
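The one-sided variants are sometimes used to drop a file extension or prefix, which hides a pitfall: when given an argument, lstrip() and rstrip() treat it as a set of characters, not as a substring. A short sketch with a made-up filename; removesuffix() requires Python 3.9 or later:

```python
filename = "report.txt"

# rstrip(".txt") strips any trailing '.', 't', or 'x' characters,
# so it also eats the final 't' of "report".
print(filename.rstrip(".txt"))        # repor

# removesuffix() (Python 3.9+) removes the exact suffix once, if present.
print(filename.removesuffix(".txt"))  # report
```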
Custom characters
Instead of whitespace, you can remove any characters from the edges:
"---hello---".strip("-") # 'hello'
Important note
strip() removes characters from the edges only, not from inside the string:
" hel lo ".strip() # 'hel lo' (inside spaces untouched)
replace()
replace() finds all occurrences of a substring and replaces them with another string, returning a new string.
Basic syntax
s = "hello world"
s.replace("world", "python") # 'hello python'
More examples
"banana".replace("a", "o") # 'bonono' (all occurrences)
"hello hello".replace("hello", "hi") # 'hi hi'
Limiting replacements
You can limit how many replacements to make:
"banana".replace("a", "o", 1) # 'bonana' (only first occurrence)
"banana".replace("a", "o", 2) # 'bonona' (only first 2 occurrences)
Important: immutability
replace() returns a new string. The original string is unchanged:
s = "hello"
s.replace("h", "H") # Returns 'Hello', but s is still 'hello'
s = s.replace("h", "H") # Now s is 'Hello' (must reassign)
startswith() / endswith()
Check if a string begins or ends with a specific substring. Returns True or False.
filename = "document.json"
filename.endswith(".json") # True
filename.endswith(".txt") # False
url = "https://example.com"
url.startswith("https") # True
Very common in validation and routing.
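Both methods also accept a tuple of candidates, which avoids chaining several checks with `or`. A minimal sketch with an illustrative filename:

```python
filename = "config.yaml"

# A tuple means "matches any of these".
print(filename.endswith((".json", ".yaml", ".yml")))  # True
print(filename.startswith(("http://", "https://")))   # False
```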
find() vs index()
Both search for a substring and return its position.
find() returns -1 if not found:
s = "hello world"
s.find("world") # 6
s.find("xyz") # -1 (not found)
index() raises a ValueError if not found:
s = "hello world"
s.index("world") # 6
s.index("xyz") # ValueError: substring not found
Use find() when you want a safe "not found" value. Use index() when you expect the substring to exist and want an error if it doesn't.
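The two styles lead to differently shaped calling code. The sketch below extracts the part after "@" both ways; the function names `domain_with_find` and `domain_with_index` are hypothetical:

```python
from typing import Optional


def domain_with_find(email: str) -> Optional[str]:
    """Return the part after '@', or None when there is no '@'."""
    at = email.find("@")
    if at == -1:           # find() signals "not found" with -1
        return None
    return email[at + 1:]


def domain_with_index(email: str) -> str:
    """Same lookup, but a missing '@' is treated as an error."""
    at = email.index("@")  # raises ValueError if '@' is absent
    return email[at + 1:]


print(domain_with_find("user@example.com"))   # example.com
print(domain_with_find("no-at-sign"))         # None
print(domain_with_index("user@example.com"))  # example.com
```

Note that the result of find() should be compared with -1 explicitly: a match at position 0 is falsy, while -1 is truthy, so `if s.find(x):` gives the wrong answer in both cases.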
count()
Count how many times a substring appears in a string.
s = "banana"
s.count("a") # 3
s.count("na") # 2
s = "hello hello"
s.count("hello") # 2
Useful for validation and analysis tasks.
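One detail worth knowing is that count() only counts non-overlapping matches. If overlapping matches are needed, a loop over find() is one option; the helper name `count_overlapping` below is hypothetical:

```python
def count_overlapping(text: str, sub: str) -> int:
    """Count occurrences of sub in text, allowing overlaps (hypothetical helper)."""
    if not sub:
        raise ValueError("sub must be non-empty")
    count = 0
    start = text.find(sub)
    while start != -1:
        count += 1
        start = text.find(sub, start + 1)  # advance by one, not by len(sub)
    return count


print("aaaa".count("aa"))                # 2  (matches at 0 and 2 only)
print(count_overlapping("aaaa", "aa"))   # 3  (matches at 0, 1, and 2)
```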
in operator (membership testing)
Check if a substring exists anywhere in a string. Very efficient for simple existence checks.
"world" in "hello world" # True
"xyz" in "hello world" # False
if "@" in email:
    ...  # plausible email (a minimal sanity check, not full validation)
Preferred over find() when you just need True/False.
Type checking methods
These methods check whether every character in the string belongs to a specific category; they all return False for an empty string.
"123".isdigit() # True (all characters are digits)
"12.3".isdigit() # False (contains a dot)
"½".isdigit() # False (a fraction is numeric, but not a digit)
"123".isnumeric() # True (all characters are numeric)
"½".isnumeric() # True (fractions are numeric)
"12.3".isnumeric() # False (dot is not numeric)
"-5".isnumeric() # False (minus sign is not numeric)
"abc".isalpha() # True (all characters are alphabetic)
"abc123".isalpha() # False (contains digits)
"abc123".isalnum() # True (all characters are alphanumeric)
"abc 123".isalnum() # False (contains a space)
" ".isspace() # True (all characters are whitespace)
"HELLO".isupper() # True (all letters are uppercase)
"hello".islower() # True (all letters are lowercase)
Key difference: isdigit() vs isnumeric()
- isdigit(): decimal digits plus digit-like characters such as superscripts ("²"); not limited to ASCII 0-9
- isnumeric(): everything isdigit() accepts, plus fractions ("½") and other numeric Unicode characters
- Both fail for decimal points and negative signs
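The categories are easiest to compare side by side. The sketch below also includes isdecimal(), a related str method that is not covered above and is the strictest of the three; the sample strings are chosen for illustration:

```python
samples = ["7", "²", "½", "-5", "1.5"]

# Columns: text, isdecimal(), isdigit(), isnumeric()
for text in samples:
    print(text, text.isdecimal(), text.isdigit(), text.isnumeric())

# 7 True True True
# ² False True True
# ½ False False True
# -5 False False False
# 1.5 False False False
```

Each method accepts everything the one to its left accepts, so isnumeric() is the broadest and isdecimal() the narrowest.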
Checking if a string is a valid number (including floats)
Type checking methods can't validate actual numbers with decimals or negative signs. Use try/except to safely convert:
def is_number(s: str) -> bool:
    try:
        float(s) # Accepts integers, floats, negatives, scientific notation
        return True
    except ValueError:
        return False
is_number("123") # True
is_number("12.5") # True
is_number("-42") # True
is_number("3.14e-2") # True (scientific notation)
is_number("abc") # False
is_number("12.3.4") # False
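This try/except approach is a robust, Pythonic way to validate numeric input in real code. Note that float() also accepts "nan", "inf", and strings with surrounding whitespace, so add an explicit check if those should be rejected.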
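A stricter variant may be useful when only finite values should count as numbers. The sketch below is one possible approach, not part of the original helper; the name `is_finite_number` is hypothetical, and math.isfinite() is used to reject nan and inf:

```python
import math


def is_finite_number(s: str) -> bool:
    """Like is_number, but rejects 'nan' and 'inf' (hypothetical stricter variant)."""
    try:
        return math.isfinite(float(s))
    except ValueError:
        return False


print(is_finite_number("12.5"))   # True
print(is_finite_number(" 42 "))   # True  (float() ignores surrounding whitespace)
print(is_finite_number("nan"))    # False
print(is_finite_number("inf"))    # False
print(is_finite_number("abc"))    # False
```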
Verbal interview questions
Answer these out loud:
- Why are strings immutable?
- Why is join() preferred over repeated +?
- Explain why the slicing stop index is exclusive.
- What is the difference between split() and split(" ")?
- Why does replace() not mutate the original string?
Coding drills
Drill 1: palindrome
def is_palindrome(s: str) -> bool:
    ...
Drill 2: normalize email
def normalize_email(email: str) -> str:
    ...
Drill 3: reverse words
def reverse_words(text: str) -> str:
    ...
1.1.10 Common interview problems
Strings are often tested through problems like:
Problem 1: Palindrome Check
Question: Given a string, determine if it is a palindrome (reads the same forwards and backwards). Ignore spaces, punctuation, and case.
def isPalindrome(s: str) -> bool:
    # "A man, a plan, a canal: Panama" → True
    # "race a car" → False
    # "0P" → False
    ...
Visual: "racecar" → reverse is "racecar" → True
Key insight: Remove non-alphanumeric characters, convert to lowercase, compare with reverse.
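The key insight translates almost line for line into code. This is one possible sketch rather than the only accepted answer; it keeps the `isPalindrome` name from the stub above:

```python
def isPalindrome(s: str) -> bool:
    """Keep only letters and digits, lowercase them, compare with the reverse."""
    cleaned = [ch.lower() for ch in s if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(isPalindrome("A man, a plan, a canal: Panama"))  # True
print(isPalindrome("race a car"))                      # False
print(isPalindrome("0P"))                              # False
```

This runs in O(n) time and uses O(n) extra space for the cleaned list; a two-pointer scan over the original string would avoid the extra space.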
Problem 2: Reverse Words in String
Question: Given a string, reverse the order of words (not characters).
def reverseWords(s: str) -> str:
    # "the sky is blue" → "blue is sky the"
    # " hello world " → "world hello"
    # Words are separated by single space; strip leading/trailing spaces
    ...
Visual: ["the", "sky", "is", "blue"] → ["blue", "is", "sky", "the"] → join with space
Key insight: Split on whitespace, eliminate empty strings, reverse, and rejoin.
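Because split() with no argument already discards empty strings, the whole approach fits in one expression. A minimal sketch, keeping the `reverseWords` name from the stub above:

```python
def reverseWords(s: str) -> str:
    # split() with no argument drops leading, trailing, and repeated whitespace.
    return " ".join(reversed(s.split()))


print(reverseWords("the sky is blue"))    # blue is sky the
print(reverseWords("  hello world  "))    # world hello
```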
Problem 3: Anagram Check
Question: Given two strings, determine if they are anagrams (contain the same characters in different order).
def isAnagram(s: str, t: str) -> bool:
    # s = "anagram", t = "nagaram" → True
    # s = "rat", t = "car" → False
    # Assume lowercase letters only
    ...
Visual: "anagram" has characters [a, n, a, g, r, a, m] and "nagaram" has the same characters → True
Key insight: Sort both strings and compare, or use character frequency counting with Counter.
from collections import Counter
Counter(s) == Counter(t) # True if anagrams
Problem 4: First Unique Character
Question: Given a string, find the index of the first character that appears only once. Return -1 if no such character exists.
def firstUniqChar(s: str) -> int:
    # s = "leetcode" → 0 ('l' appears once at index 0)
    # s = "loveleetcode" → 2 ('v' is first unique)
    # s = "aabb" → -1 (no unique chars)
    ...
Visual: "leetcode" → 'l' appears once, at index 0 → return 0
Key insight: Count character frequencies, then iterate through string to find first with count == 1.
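from collections import Counter

def firstUniqChar(s: str) -> int:
    counts = Counter(s)  # frequency of every character
    for i, char in enumerate(s):
        if counts[char] == 1:
            return i  # first character that occurs exactly once
    return -1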
Problem 5: Longest Substring Without Repeating Characters
Question: Given a string, find the length of the longest substring without repeating characters.
def lengthOfLongestSubstring(s: str) -> int:
    # s = "abcabcbb" → 3 ("abc")
    # s = "bbbbb" → 1 ("b")
    # s = "pwwkew" → 3 ("wke")
    ...
Visual: "abcabcbb" → sliding window finds "abc" with length 3
Key insight: Sliding window with a set or dictionary tracking character positions. Move left pointer when duplicate found.
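def lengthOfLongestSubstring(s: str) -> int:
    char_index = {}  # last index at which each character was seen
    max_len = 0
    left = 0  # left edge of the current window
    for right, char in enumerate(s):
        # A repeat inside the window moves the left edge past the earlier copy.
        if char in char_index and char_index[char] >= left:
            left = char_index[char] + 1
        char_index[char] = right
        max_len = max(max_len, right - left + 1)
    return max_len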
Complexity: O(n) with sliding window.
Problem 6: Valid Parentheses
Question: Given a string containing just the characters '(', ')', '{', '}', '[' and ']', determine if the input string is valid (each closing bracket has a matching opening bracket of the same type, in correct order).
def isValid(s: str) -> bool:
    # "()" → True
    # "()[]{}" → True
    # "(]" → False (mismatched)
    # "([{}])" → True (nested)
    # "([)]" → False (incorrect order)
    ...
Visual: Process "([)]" left-to-right:
- See (, push to stack
- See [, push to stack
- See ), pop [ from stack → mismatch! → False
Key insight: Use a stack (list) to match brackets. Push opening brackets, pop and verify match on closing brackets.
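def isValid(s: str) -> bool:
    stack = []
    pairs = {'(': ')', '[': ']', '{': '}'}  # opening bracket -> expected closing bracket
    for char in s:
        if char in pairs:
            stack.append(char)
        elif not stack or pairs[stack.pop()] != char:
            return False  # nothing to close, or wrong bracket type
    return not stack  # leftover opening brackets mean the string is invalid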
Complexity: O(n) time, O(n) space for stack.
Problem 7: Group Anagrams
Question: Given a list of strings, group anagrams together.
def groupAnagrams(strs: list[str]) -> list[list[str]]:
    # strs = ["eat", "tea", "ate", "nat", "tan", "bat"]
    # Return [["eat", "tea", "ate"], ["nat", "tan"], ["bat"]]
    # (or any grouping order)
    ...
Visual: Sort characters in each word:
- "eat" → "aet"
- "tea" → "aet" (same key!)
- "ate" → "aet" (same key!)
Group by sorted key.
Key insight: Sort characters within each word as a grouping key. Use a dictionary with sorted string as key.
from collections import defaultdict

def groupAnagrams(strs: list[str]) -> list[list[str]]:
    groups = defaultdict(list)
    for word in strs:
        key = ''.join(sorted(word))  # anagrams share the same sorted key
        groups[key].append(word)
    return list(groups.values())
Problem 8: Longest Common Prefix
Question: Given a list of strings, find the longest string that is a prefix of all of them.
def longestCommonPrefix(strs: list[str]) -> str:
    # strs = ["flower", "flow", "flight"] → "fl"
    # strs = ["dog", "racecar", "car"] → ""
    # strs = ["interspecies", "interstellar"] → "inters"
    ...
Visual: Compare characters position by position across all strings:
- Position 0: all have "f" → keep
- Position 1: all have "l" → keep
- Position 2: "o", "o", "i" → stop
Result: "fl"
Key insight: Iterate through character positions. Stop when any string runs out or characters don't match.
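def longestCommonPrefix(strs: list[str]) -> str:
    if not strs:
        return ""
    for i in range(len(strs[0])):
        char = strs[0][i]
        for j in range(1, len(strs)):
            # Stop when another string is too short or disagrees at position i.
            if i >= len(strs[j]) or strs[j][i] != char:
                return strs[0][:i]
    return strs[0]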
Problem 9: Implement strStr() (KMP Pattern Matching)
Question: Implement a function that finds the index of the first occurrence of a needle in a haystack. Return -1 if not found (like find()).
def strStr(haystack: str, needle: str) -> int:
    # haystack = "sadbutsad", needle = "sad" → 0
    # haystack = "leetcode", needle = "leeto" → -1
    # needle = "" → 0 (empty string matches at start)
    ...
Simple approach: Check every position.
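def strStr(haystack: str, needle: str) -> int:
    # Try every start position where needle could still fit.
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i:i+len(needle)] == needle:
            return i
    return -1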
Key insight: Use the string's built-in find() method, or implement the KMP algorithm for O(n + m) time instead of the O(n * m) worst case of the simple approach.
Production context: haystack.find(needle) is simpler.
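For completeness, here is a sketch of the KMP approach mentioned above. It is one common formulation, offered as an illustration rather than a reference implementation; the names `build_failure_table` and `kmp_find` are hypothetical:

```python
def build_failure_table(pattern: str) -> list[int]:
    """failure[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]  # fall back to a shorter matching prefix
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_find(haystack: str, needle: str) -> int:
    """Return the index of the first occurrence of needle, or -1."""
    if not needle:
        return 0
    failure = build_failure_table(needle)
    k = 0  # number of needle characters matched so far
    for i, ch in enumerate(haystack):
        while k > 0 and ch != needle[k]:
            k = failure[k - 1]  # reuse the partial match instead of restarting
        if ch == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 1
    return -1


print(kmp_find("sadbutsad", "sad"))      # 0
print(kmp_find("leetcode", "leeto"))     # -1
print(kmp_find("abcabcabd", "abcabd"))   # 3
```

The haystack index never moves backwards, which is what gives the O(n + m) bound: O(m) to build the table and O(n) to scan.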
Problem 10: Word Ladder
Question: Given two words (start and end) and a dictionary, find the shortest transformation sequence from start to end such that each intermediate word exists in the dictionary and differs by exactly one letter.
def ladderLength(beginWord: str, endWord: str, wordList: list[str]) -> int:
    # beginWord = "hit", endWord = "cog"
    # wordList = ["hot", "dot", "dog", "lot", "log", "cog"]
    # "hit" → "hot" → "dot" → "dog" → "cog"
    # Return length = 5
    ...
Visual: Graph problem where edges connect words differing by one letter. Use BFS to find shortest path.
Key insight: BFS with pattern matching. For each word, generate all possible one-letter variations and check dictionary.
from collections import deque

def ladderLength(beginWord: str, endWord: str, wordList: list[str]) -> int:
    word_set = set(wordList)  # O(1) membership checks
    if endWord not in word_set:
        return 0
    queue = deque([(beginWord, 1)])
    while queue:
        word, length = queue.popleft()
        if word == endWord:
            return length
        for i in range(len(word)):
            for c in 'abcdefghijklmnopqrstuvwxyz':
                if c != word[i]:
                    new_word = word[:i] + c + word[i+1:]
                    if new_word in word_set:
                        word_set.remove(new_word)  # mark as visited
                        queue.append((new_word, length + 1))
    return 0
Complexity: O(n * l * 26) candidate words are generated, where n = words in dictionary and l = word length; building each candidate by slicing costs an additional O(l).