Why do indices start at 0 in programming?
2025-01-26
One of the most puzzling questions I had while studying algorithms was "Why does the first element of an array start at index 0 instead of index 1?". Let's say we have a list like this:
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
The first element of this list is "Mon" and the second element is "Tue". But why should we use days[1] instead of days[2] to find Tue in this list?
This is because indexing starts at 0, not 1. In real life, if we saw a list like the one above and were asked for the position of Tue, we would probably say 2 or second.
Curious as to why, I decided to look up the basic principles and math behind computer science, and here's what I found.
1. Historical background
Since the C language adopted zero-based indexing, most modern programming languages have followed suit, including Python, Java, and JavaScript.
This provides consistency that reduces confusion when programmers switch from one language to another.
2. The concept of offset
The most important conceptual benefit of zero-based indexing is that it provides a natural representation of "offset".
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
          ↑      ↑      ↑      ↑      ↑
          0      1      2      3      4
Each index represents a distance from the starting point.
- "Mon": 0 spaces away from the starting point
- "Tue": 1 space away from the starting point
- "Wed": 2 spaces away from the starting point
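To make the offset idea concrete, here is a minimal Python sketch. The helper name `offset_of` is my own (hypothetical, not from any library); it simply counts how many steps we take from the start of the list before reaching a value, and that count is the same number as the 0-based index.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def offset_of(items, value):
    """Count the steps taken from the start of items before reaching value."""
    steps = 0
    for item in items:
        if item == value:
            return steps
        steps += 1
    raise ValueError(f"{value!r} not found")

# Map every day to its distance from the starting point.
offsets = {day: offset_of(days, day) for day in days}
print(offsets)
```

Python's built-in `days.index("Wed")` gives the same number as `offset_of(days, "Wed")`, so the index can be read directly as "how far from the start".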
3. Efficiency of memory address calculation
When the starting address of an array is A and the size of each element is S, the address of the element at index i is calculated as follows:
- 0-based indexing: element's address = A + (i × S)
- 1-based indexing: element's address = A + ((i-1) × S)
Zero-based indexing is more efficient in the sense that it doesn't require the extra subtraction (-1) on each access.
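The two formulas can be sketched in Python as plain arithmetic. The function names, the base address 1000, and the 4-byte element size are assumptions I picked for illustration. The second half uses the standard `ctypes` module to read an element of a real C-style array through its computed address, which should work on a standard CPython install.

```python
import ctypes

def address_zero_based(base, size, i):
    """Address of element i when the first element has index 0."""
    return base + i * size

def address_one_based(base, size, i):
    """Address of element i when the first element has index 1."""
    return base + (i - 1) * size

# Hypothetical array: starts at address 1000, 4 bytes per element.
BASE, SIZE = 1000, 4
third_zero_based = address_zero_based(BASE, SIZE, 2)  # third element, index 2
third_one_based = address_one_based(BASE, SIZE, 3)    # third element, index 3

# The same arithmetic applied to a real array of 32-bit integers.
numbers = (ctypes.c_int32 * 5)(10, 20, 30, 40, 50)
start = ctypes.addressof(numbers)
step = ctypes.sizeof(ctypes.c_int32)
value = ctypes.c_int32.from_address(address_zero_based(start, step, 2)).value

print(third_zero_based, third_one_based, value)
```

Both conventions land on the same address for the third element; the 1-based version just has to subtract 1 before multiplying.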
4. Mathematical benefits
Calculating the length of an interval
- With a starting index of 0, the length of the interval [0, N) is simply N
- Slicing is more intuitive (e.g., the first 3 elements are [0:3])
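A short sketch of the interval arithmetic above, using Python slices. The helper `half_open_length` is a hypothetical name I added to spell out the "stop minus start" rule; everything else is ordinary list slicing.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]

def half_open_length(start, stop):
    """Length of the half-open interval [start, stop)."""
    return stop - start

first_three = days[0:3]   # the interval [0, 3) holds 3 elements
middle = days[1:4]        # the interval [1, 4) holds 4 - 1 = 3 elements

# Splitting at k gives two pieces that meet exactly, with no overlap or gap.
k = 2
head, tail = days[:k], days[k:]

print(first_three, middle, head, tail)
```

Because the end of one slice is the start of the next, `head + tail` rebuilds the original list without any +1 or -1 adjustment.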
Simplified use of iterators
# 0-based indexing
for i in range(5):  # 0, 1, 2, 3, 4
    print(days[i])
# If indexing were 1-based
for i in range(1, 6):  # 1, 2, 3, 4, 5
    print(days[i - 1])  # needs -1 every time
Similar examples in real life
- Start at 0 cm when measuring with a ruler
- Digital clocks start at 0:00
- Number of basement floors in a building (1st basement, 2nd basement...)
Choices made by other programming languages
While most programming languages use 0-based indexing, a few languages have chosen to use 1-based indexing.
% MATLAB example
array = [1, 2, 3, 4, 5];
first_element = array(1); % 1st element
# R example
array <- c(1, 2, 3, 4, 5)
first_element <- array[1] # 1st element
-- Lua example
array = {1, 2, 3, 4, 5}
first_element = array[1] -- 1st element
These languages chose 1-based indexing for the following reasons:
- **User base**
  - MATLAB: primarily used by mathematicians and scientists, who are used to the mathematical convention of numbering matrix rows and columns from 1.
  - R: a language for statisticians, which also emphasizes mathematical intuitiveness.
- **Specificity of use**
  - Languages specialized in a particular field, such as scientific computation or statistical analysis, were more concerned with following the conventions of that field.
  - They prioritized user convenience over consistency with general-purpose programming languages.
- **Design philosophy**
  - The philosophy that "it should be easier for humans to understand than for computers"
  - Choosing to prioritize human intuition over technical efficiency
As you can see, not every language treats zero-based indexing as the only answer. It's a matter of choosing the right convention for the purpose and the audience.
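To get a feel for what a 1-based language does behind the scenes, here is a small Python sketch. `OneBasedList` is a hypothetical wrapper I wrote for illustration, not something from MATLAB, R, Lua, or any library; it only shows the -1 translation that has to happen somewhere when positions start at 1.

```python
class OneBasedList:
    """Hypothetical wrapper that exposes a Python list through 1-based positions."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, position):
        # Valid positions run from 1 to len(self), inclusive.
        if not 1 <= position <= len(self._items):
            raise IndexError(f"position {position} is out of range")
        return self._items[position - 1]  # translate to the 0-based offset

week = OneBasedList(["Mon", "Tue", "Wed", "Thu", "Fri"])
print(week[1], week[len(week)])
```

With this wrapper the "first" element is `week[1]` and the last is `week[len(week)]`, which reads naturally to a human while the subtraction is hidden inside `__getitem__`.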
0-based vs 1-based in math
Even in math, the starting point differs depending on the context.
- **Starting from 1**
  - The set of natural numbers (1, 2, 3, ...), in the common convention that excludes 0
  - Row and column numbers in a matrix
  - Ordinal numbers representing order (1st, 2nd, 3rd...)
- **Starting from 0**
  - The set of integers, with 0 at the center (..., -2, -1, 0, 1, 2, ...)
  - The origin (0, 0) of a coordinate system
  - Measuring distance or displacement
  - Exponents (x⁰, x¹, x², ...)
  - Degree of a polynomial (a nonzero constant has degree 0)
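The exponent and polynomial items are easy to see in code. In this sketch (the function names `evaluate` and `degree` are my own, for illustration) a polynomial is stored as a list of coefficients, and the 0-based index of each coefficient is exactly the exponent it belongs to.

```python
def evaluate(coefficients, x):
    """Evaluate a polynomial where coefficients[i] multiplies x**i."""
    return sum(c * x**i for i, c in enumerate(coefficients))

def degree(coefficients):
    """Degree of the polynomial, assuming the last coefficient is nonzero."""
    return len(coefficients) - 1

# 1 + 2x + 3x^2: index 0 is the constant term, index 2 is the x^2 term.
p = [1, 2, 3]
print(evaluate(p, 2), degree(p))
```

With 1-based positions the coefficient at position i would belong to the exponent i - 1, so the index and the exponent would no longer line up.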
In many areas of math, especially those related to computer science, counting from zero is more natural:
- Graph theory in discrete mathematics
- Counting the number of cases in combinatorics
- Event spaces in probability theory
So it can't be reduced to "math intuition = 1-based". MATLAB and R chose 1-based indexing because it matches the conventions of the areas of mathematics their primary users work in (matrix computation and statistics).
Human intuition vs. computer efficiency
One of the biggest concerns in programming language design is the balance between "human intuitiveness" and "computer efficiency". Zero-based indexing is a good example of this tradeoff.
Price of giving up intuitiveness
It's no wonder that people learning to program for the first time have a hard time adjusting to zero-based indexing. We've been counting "first" as 1 all our lives. However, this small sacrifice of intuition brings the following benefits:
- more efficient memory access
- simpler range calculations
- fewer CPU operations
The essence of programming
At the end of the day, programming is the process of translating human thoughts into a way that computers can understand. In this process, sometimes humans must adopt the computer's way of thinking, and sometimes the computer must adopt the human way of thinking.
Zero-based indexing is a good example of the former. It's inconvenient at first, but once you understand it, you realize it's a more logical and consistent system.
How modern programming languages solve the problem
Modern programming languages try to solve this problem in their own way. Take Python, for example.
# In Python
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
first = days[0]  # uses 0-based indexing
last = days[-1]  # offers intuitive negative indexing
third = days[2]  # computer-friendly
It keeps the basics computer-friendly, but adds human-friendly features like negative indexing and slicing.
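Here is a slightly longer sketch of those human-friendly features. It uses only built-in Python: negative indices, a slice counted from the end, and `enumerate` with its `start` parameter to show 1-based numbers to a reader while the list itself stays 0-based.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]

last = days[-1]        # same element as days[len(days) - 1]
last_two = days[-2:]   # slice counted from the end of the list

# Number the items from 1 for display without changing how they are stored.
numbered = list(enumerate(days, start=1))

for position, day in numbered:
    print(f"{position}. {day}")
```

The storage and the arithmetic stay zero-based, and the 1-based numbering appears only at the point where a human reads the output.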
Conclusion
The question that initially prompted me to write this article was "Why should we sacrifice human intuitiveness for computer efficiency?". I'm a Python developer, but the design philosophies of the 1-based indexing languages I found resonated with me.
But as I dug deeper, I realized something. At its core, programming is all about translating human thought into a way that computers can understand. What we do as developers is talk to computers, and sometimes we need to learn their language.
Zero-based indexing can feel uncomfortable and counterintuitive at first. We've been counting "first" as 1 all our lives, but this small sacrifice of intuition has the benefit of more efficient memory access, simpler range calculations, and fewer CPU operations.
Now it makes sense. Zero-based indexing isn't just a convention or a legacy of the C language. It's a logical choice that fits well with the fundamental principles of computer science, and one that has shaped the evolution of modern programming.
At the end of the day, a good programming language is about finding the right balance between human intuitiveness and computer efficiency, and zero-based indexing was one answer to the search for that balance.