Clean Code: The Art of Clean Naming

Mar 06, 2024

One of the hardest parts about coding is “naming” stuff.

When done correctly, anyone can read your code.

When done poorly, others may spend significant amounts of time trying to figure out what is going on.

The art of clean naming, and yes, it is an art, is something you can practice anytime you code. It is something I do intentionally when I code. There are times when naming comes naturally to me. Then there are times when I have no idea what I should name something, so I call it x.

The trouble starts when I name something x instead of what it actually is: an employee, a customer, or a student. Let’s dive in and see examples of good naming conventions in code.

Clarity is Key

How does one show clarity? Isn’t this super obvious when trying to name something?

For example, let us think of the days of a week. There are 7 in total. Therefore we could have code like this:

week = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

day = week[2]

Looking at this, you may think the meaning is obvious. It is not. The name day could mean anything. Is this a day of the year? Day as in daytime (the opposite of night)? The name week can also be confusing, especially if all you see is the day variable.

We could enhance this by saying:

days_of_the_week = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'] 

selected_day = days_of_the_week[2]

Now this is getting better: days_of_the_week makes more sense than just week, and selected_day makes more sense than just day.

However, when looking at selected_day, what does the 2 symbolize? Is it an index? In this case it is, but it only means Wednesday because the list starts on Monday. Could this mess things up in the future? Could we make this even more specific for our use case?

MONDAY = 0 
TUESDAY = 1 
WEDNESDAY = 2 
THURSDAY = 3 
FRIDAY = 4 
SATURDAY = 5 
SUNDAY = 6 

days_of_the_week = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'] 

selected_day = days_of_the_week[WEDNESDAY]

We could make this even more readable by turning the days into an enumeration, but I think you get the point. No matter where you start reading this code, you know immediately that selected_day is “Wednesday”.
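As a minimal sketch of the enumeration idea mentioned above, Python's standard enum module can replace the seven separate constants. The DayOfWeek name is a hypothetical choice for illustration; it does not appear in the earlier snippets.

```python
from enum import IntEnum


class DayOfWeek(IntEnum):
    """Index of each day in a Monday-first list (hypothetical name)."""
    MONDAY = 0
    TUESDAY = 1
    WEDNESDAY = 2
    THURSDAY = 3
    FRIDAY = 4
    SATURDAY = 5
    SUNDAY = 6


days_of_the_week = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# IntEnum members behave like integers, so they work directly as list indexes.
selected_day = days_of_the_week[DayOfWeek.WEDNESDAY]
print(selected_day)  # Wednesday
```

Grouping the constants under one name also tells the reader that they belong together, which seven loose variables do not.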

Provide context when needed

Providing context in variable names is essential for making your code self-explanatory. It helps other developers (and yourself in the future) understand the purpose and scope of a variable without needing to trace its usage across the codebase. Contextual information in a variable name clarifies the role of the variable within a larger structure or process.

Why Context Matters

Without proper context, developers might misinterpret the variable's purpose or scope, leading to potential bugs or misuse of the variables. Contextual naming is particularly crucial in larger codebases, where variables can interact across different modules or functions, and ambiguity can lead to significant confusion.

Example Explained

Let’s say we have code like:

first_name = "John" 
last_name = "Doe"

In this snippet, first_name and last_name are too generic. In a simple or small application, you may be able to get away with this. If this code is part of a larger system dealing with multiple entities, such as customers, employees, and users, it's not clear whom these names refer to.

Two things are wrong with the snippet above:

Ambiguity: Without context, developers might assume these names refer to the currently logged-in user, an employee record, or other entities.
Maintenance Difficulty: If you or someone else revisits the code later, the context has to be deduced from the usage, which can be time-consuming and error-prone.

To make this easier to read, simply add context to each variable:

customer_first_name = "Eric"
customer_last_name = "Roby"

Now we immediately know who the first and last names are for. They are for the customer! By prefixing first_name and last_name with customer_, the variables' purposes become clear:

Clarity: It's evident that these variables store the first and last names of a customer, not any other entity.
Ease of Maintenance: When you or other developers revisit the code, it will be immediately clear what these variables represent, reducing the cognitive load and potential for confusion.
Better Collaboration: In a team setting, clear, contextual naming helps team members quickly understand and collaborate on the code, even if they're seeing a particular part for the first time.
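To illustrate why the prefix matters, here is a small sketch in which two kinds of people appear in the same scope. The employee values and the format_full_name helper are hypothetical, added only for this example.

```python
def format_full_name(first_name, last_name):
    """Join a first and last name with a single space (hypothetical helper)."""
    return f"{first_name} {last_name}"


# With prefixes, both entities can share one scope without ambiguity.
customer_first_name = "Eric"
customer_last_name = "Roby"
employee_first_name = "John"  # made-up employee for illustration
employee_last_name = "Doe"

customer_full_name = format_full_name(customer_first_name, customer_last_name)
employee_full_name = format_full_name(employee_first_name, employee_last_name)

print(customer_full_name)  # Eric Roby
print(employee_full_name)  # John Doe
```

Inside the helper, the generic names first_name and last_name are fine, because the function really does work for any person. The context belongs wherever the data refers to one specific entity.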

Avoid Magic Numbers

The concept of "magic numbers" in programming refers to the use of hard-coded numbers in your code without an explanatory context or name. These numbers are called "magic" because their meaning isn't clear without additional context, making the code harder to understand and maintain. Replacing magic numbers with named constants not only clarifies their purpose but also simplifies future modifications and increases the code's readability.

There are a few reasons why you should avoid magic numbers, but the biggest would be for readability.

Let’s say we have this code:

if age >= 21:
    pass  # Do something

It is not clear what 21 stands for. Why does age have to be at least 21 to do something? And the age of what, a person?

In the United States, the drinking age is 21. I know this differs from country to country but I want to make sure we are all on the same page about the code coming up.

legal_drinking_age = 21
if user_age >= legal_drinking_age:
    print("Cheers!")

There is no confusion with this code. The legal drinking age is 21, and when user_age is greater than or equal to 21, the user can drink. It makes sense and is easy to read.

In the improved version:

Self-explanatory Code: The variable legal_drinking_age indicates that 21 is the age threshold for legal drinking. It acts as self-documenting code, making comments unnecessary for explaining the number's significance.
Easy to Update: If the legal drinking age changes, you only need to update the legal_drinking_age constant, and the change reflects wherever it's used.
Reduces Errors: By using a named constant, you reduce the risk of typos or incorrect updates. Changing one occurrence of 21 to 18 but missing others could introduce bugs. With a named constant, this risk is mitigated.
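The "Easy to Update" and "Reduces Errors" points are easier to see when the threshold is used in more than one place. The sketch below rests on a few assumptions: both function names are hypothetical, and the constant is written in upper case, which is the usual Python (PEP 8) convention for constants.

```python
LEGAL_DRINKING_AGE = 21  # United States threshold, as assumed in the text above


def can_order_alcohol(user_age):
    """Return True when the user meets the legal drinking age (hypothetical helper)."""
    return user_age >= LEGAL_DRINKING_AGE


def years_until_legal(user_age):
    """Return how many years remain before the user may drink, never below zero."""
    return max(LEGAL_DRINKING_AGE - user_age, 0)


print(can_order_alcohol(21))  # True
print(years_until_legal(18))  # 3
```

If the threshold ever changes, only the first line needs editing, and both functions pick up the new value.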

Avoiding abbreviations and acronyms

Avoiding abbreviations and acronyms in your code unless they are widely understood is a key principle in writing clean, readable, and maintainable code. Abbreviations and acronyms can significantly obscure your code's meaning and make it more difficult for others (and yourself in the future) to understand, especially if they are not standard across the domain or programming community.

Why Avoid Abbreviations and Acronyms

Clarity: Non-standard abbreviations or acronyms can lead to confusion or misinterpretation. For example, acc could mean "account," "accumulator," or "accelerator," depending on the context.
Readability: Full names make your code more readable and self-explanatory. Readers don't have to pause to decipher what an abbreviation stands for or look it up elsewhere.
International Context: In global teams, an abbreviation that makes sense in one language or culture might be unclear or mean something different in another.
Consistency: Different developers might use different abbreviations for the same terms, leading to inconsistent naming across the codebase.

Detailed Example Analysis

Without Abbreviations

account_balance

Explicit: The term account_balance leaves no room for ambiguity; it indicates that the variable represents the balance of an account.

With Abbreviations

acc_bal

Ambiguous: Here, acc could stand for "account," but it could also mean "accumulator" or any other term starting with "acc." Similarly, bal could mean "balance," but it might also stand for "ballistic," "balcony," etc., in different contexts.

Best Practices for Naming

Use Full Names: Always opt for full descriptive names over abbreviations. For example, use customer_address instead of cust_addr.
Common Acronyms are Okay: Widely recognized acronyms and abbreviations that are standard in the industry or programming language can be used. For instance, HTTP for Hypertext Transfer Protocol is universally recognized.
Document Non-Obvious Abbreviations: If you must use an abbreviation, ensure it's documented close to its first use or in a common documentation area if it's used widely across the codebase.
Consistent Abbreviations: If abbreviations are necessary and no standard exists, define a standard within your project and use it consistently.
Code Reviews: Utilize code reviews to catch and discuss non-standard abbreviations, ensuring team consensus on naming conventions.
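Applied to code, the first two practices might look like the sketch below. All of the values and the http_status_code name are invented for illustration; cust_addr and acc_bal are the abbreviated forms discussed above.

```python
# Abbreviated: the reader has to guess what each fragment stands for.
cust_addr = "123 Main Street"  # made-up value
acc_bal = 250.75               # made-up value

# Full names: nothing to decode.
customer_address = "123 Main Street"
account_balance = 250.75

# A widely recognized acronym such as HTTP is fine to keep as-is.
http_status_code = 200  # hypothetical variable; 200 is the standard HTTP "OK" status

print(f"{customer_address}: balance {account_balance:.2f}")
```

The two versions hold the same data. The only difference is how much work the reader has to do.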

By prioritizing clear and descriptive naming, you enhance the understandability and maintainability of your code, making it more accessible to current and future developers, including those who might not share your context or background.

Cheers,

Eric

Foersom

Mar 10, 2024

first_name and last_name are bad variable names. First name in one language may have a different meaning in another language, e.g. in French the family name is written first (usually in all capital letters). So better variable names would be given_name and family_name, to make clear what is what.

Good naming of variables and functions is important for readable source code. To help me find the right name I often use synonym dictionaries to get ideas for better naming. My favorite online synonym dictionary is WordHippo.

Another aspect of naming is using opposite words for related start-stop functions: Open / Close, Start / Stop, Begin / End, Acquire / Release, Enter / Exit


Akos Komuves

Mar 7, 2024

This is a great write-up Eric! Naming is an important code-quality metric that should deserve more attention.

In my ideal world, I always name variables based on what they hold. So, for example, if I’m always selecting Wednesdays, I’d do this:

wednesday = week[2]


Brain Bytes
