Arguments against Hungarian Notation

Kimbro Staken plug-devel@lists.PLUG.phoenix.az.us
Tue May 7 01:00:21 2002


On Sunday, May 5, 2002, at 10:12 AM, Rob Wehrli wrote:
>
> Don't be a fool.  The VB programmers at Microsoft (and many other
>

Well wow, I'm so glad that I didn't get flamed or something. Nice, really 
nice. I guess I should have stated that I've used both VB and Hungarian 
notation.

> places!) aren't stupid monkeys and just started using it "without really
> understanding what it was for."  I don't mind poking fun at VB and even
>
And how does that imply that VB programmers are stupid monkeys? It is a 
simple statement, and since I myself have done a bit of VB work at various 
times, if I had meant it to be truly derogatory I would be criticizing 
myself. You read too much into an innocent statement, especially one that 
is really true. Very few people carefully consider the techniques that 
they use; this applies to VB programmers just as much as it does to Java 
programmers or assembly language programmers. In reality, all people on 
some level, or in some activity, simply follow rather than consider why 
something is done. Unfortunately, in the case of VB you have a vendor in 
Microsoft who provides the leadership, and everybody else just follows. 
Sun of course does the same for Java.

Hungarian notation was designed at a time when the programming model for 
DOS/Windows was even more complex than it is now. At the time the standard 
Windows programming language was C, C++ was not yet in real use, and 
Visual Basic was just a twinkle in Alan Cooper's eye. Hungarian was needed 
because you had a lot of variables, a lot of pointers, and a really crappy 
memory model that made you keep track of different types of pointers. In 
that environment it was very useful to know whether something was a long 
pointer to a null-terminated string or just a pointer to a byte buffer.

In a language like VB or Java you don't need this extra syntax. You have a 
flat memory model and no pointers. You also make extensive use of 
components that encapsulate many things that would otherwise have been 
variables. This is especially true in Java, where there is no such thing 
as a global variable, so your variables live within a very limited scope.
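As an illustrative sketch (the class and names here are hypothetical, not 
from this thread): in Java even the closest thing to a "global" is a 
class-level constant, and everything else lives in a scope narrow enough 
that a reader never needs a prefix to recall where a value came from.

```java
// Minimal sketch, hypothetical names: every variable lives inside a
// class or a method, so scope itself limits what the reader must track.
class RetryPolicy {
    // The nearest thing to a "global": a constant confined to this class.
    private static final int MAX_RETRIES = 3;

    int retriesRemaining(int attemptsMade) {
        // A purely local variable, visible only inside this method.
        int remaining = MAX_RETRIES - attemptsMade;
        return Math.max(remaining, 0);
    }
}
```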

In addition you now have vastly superior development tools that can 
provide the needed insight into the code without adding extra syntax to 
it. This is especially true in VB, where there is only one IDE in use.

Hungarian notation had a time and a place, and it was valuable in that 
time and place. However, technology and language design have progressed 
since then, and in my opinion it is no longer of any value in Java; in 
fact it damages the overall quality and understandability of the code.

> a few other languages, but its programmers are not some version of
> mutated pygmies from Mars.  They are intelligent humans with strengths
> and capacities that enable them to write useful applications for
> Windows.  Charlie Simonya "invented" Hungarian notation...or at least,
>

His name is Charles Simonyi, and he originally wrote this up as part of 
his doctoral thesis. I believe he was still at PARC at the time; it wasn't 
until he joined Microsoft that it was widely used.

> his use of it and his Hungarian ancestry represent our collective
> understanding of the combined words and their meanings.
>

Another reason it was called Hungarian notation is that people thought it 
made the code look like it was written in Hungarian. That is a very bad 
thing.

>
>> conventions for Java. [1] It is also pretty useless in a pure OO language
>> where you don't have pointers, it really just obfuscates the code.
>
> Huh?
>
> public class BadCode
> {
>   public	String 		a = "Using Hungarian Makes Sense."
>   public	boolean		b = false;
>   public	Boolean		B = new Boolean( false );
>   public	int		i = 0;
>   public	Integer		I = new Integer( 0 );
>   public	char		c = 'c';
>   public	Character 	C = new Character( c );
>   public	double		d = 12345678.12345;
>   public	Double		D = new Double( 12345678.12345 );
>   public	float		f = 123456.7890F;
>   public	Float		F = new Float( 123456.7890 );
>
>   public BadCode()
>   {}
> }
>
> versus
>
> String 		strName 	= "Charles Simonya";
> boolean 	bIsRaining 	= false;
> int		iCounter	= 0;


Huh yourself? You're arguing for Hungarian notation versus an example that 
is simply poor programming? What makes you think that a programmer who 
names variables with one letter will name them any better if Hungarian 
notation is brought into play? If you force them to use Hungarian you're 
not magically going to get nicely named variables; you're simply going to 
get variables with more letters in them. And even worse, if someone can't 
get a basic principle like descriptive variable names right, what makes 
you think they'll be able to apply Hungarian notation consistently and 
correctly?

> I don't see how the Hungraian "obfuscates" the code...I can see how it
> would dramatically HELP the code...

A more appropriate comparison would be to this code.

String  name      = "Charles Simonyi";
boolean isRaining = false;
int     counter   = 0;

Your example actually illustrates one of the problems with the notation 
quite clearly. When I looked at the variable bIsRaining I had to carefully 
work out what the variable was actually named and what was the notation. 
My first thought was that it was named Raining and the notation was bIs, 
where the capital I looks exactly like a lowercase L. It took me several 
seconds to realize that the actual variable name was isRaining.

It is this need to decode the meaning of the prefix that causes the harm 
with Hungarian notation. For the programmer who wrote the code originally 
it may be OK, but for other people it presents a serious distraction away 
from the core meaning of the problem being solved and toward the mechanics 
of the programming language. The whole reason we use higher-level 
languages is to move away from focusing on the mechanics required by the 
machine and toward a focus on the problem domain. Anything you add that 
forces your focus back to the mechanics, in places where it is 
unnecessary, is a step backwards.

Your example also points out why the notation shouldn't be needed when you 
use proper variable names. Well-chosen names imply their types, as the 
names you chose do when stripped of their prefixes.
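To make that concrete, here is a small sketch (the Weather class and its 
names are hypothetical, chosen only to illustrate the point): the names 
alone signal the types, and the compiler enforces correct usage whether or 
not a prefix is present.

```java
// Hypothetical sketch: well-chosen names already imply their types.
class Weather {
    private boolean raining;              // isRaining() reads as yes/no
    private int rainyDayCount;            // a count is clearly an integer
    private String cityName = "Phoenix";  // a name is clearly text

    Weather(boolean raining) { this.raining = raining; }

    boolean isRaining() { return raining; }

    void recordDay() {
        if (raining) {
            rainyDayCount++; // misusing the int as a boolean would not compile
        }
    }

    int getRainyDayCount() { return rainyDayCount; }
}
```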

By using Hungarian notation, you're basically making the statement that 
"type is the most important aspect of a variable". I would argue that 
"meaning is the most important aspect of a variable, and type is a very 
secondary consideration". I believe that if you want to improve the 
understandability of your code, you should focus on choosing better 
variable names, not on adding unnecessary type information to them.

> Note this class (fragment) that I wrote for a simple racing "stats"
> thing for one of the local drag racing clubs where I participate.
>
> public final class Car
> {
> 	long lPrimaryKey;
> 	int iRacesThisSeason;
> 	int iPointsThisSeason;
> 	int iMatchesThisSeason;
> 	int iMatchPointsThisSeason;
> 	String strNumber;
> 	String strOwner;
> 	String strYear;
> 	String strModel;
> 	String strColor;
> 	String strMotor;
> 	String strInduction;
> 	String strNitrous;
> 	boolean bLicensed;
> 	String strBarCage;
> 	String strTireType;
> 	String strTireWidth;
> 	String strNotes;
> 	String strLocation;
>
> ...
> }
>
> What is so obfuscating about that?

It adds unnecessary syntax and forces you to consider the type of a 
variable before considering what the variable is for. This gets in the way 
of understanding the intention of the code by forcing you to first 
consider its mechanics. Even in code that I wrote myself I found this to 
be a problem when I used Hungarian notation.

My argument is from the psychological perspective of understanding the 
code, not from the perspective of adding as much information as possible. 
I believe that the added information simply gets in the way and causes a 
series of minor distractions that in the end harm productivity. My 
argument is probably irrelevant if you are the only person who sees your 
code; my concern is only with team environments, which was the domain of 
the original question.

>
> How can it possibly be obfuscation?
>
> IListenerBroadcast
> ITerminalOutput
>

Because it provides irrelevant information and changes the meaning of the 
name to be logically incorrect.

Let's take your Car class as an example. You obviously designed that class 
with one particular type of car in mind, so in your code you declare an 
instance of the class something like this:

Car winningCar;

Now what happens if you decide later that a car could be either a real car 
or a matchbox car, and the implementations of the two will be completely 
different? You would then convert Car into an interface and add RealCar 
and MatchboxCar classes that implement it. If you follow your naming 
convention you would also have to rename Car to ICar just because it 
became an interface, so you then have to change your declarations to:

ICar winningCar;

What is the logical reason for reflecting this change in the actual syntax 
of the program? It seems to me the logical approach is to keep the 
interface named Car, because after all it "is" still a car. The fact that 
in the actual programming language implementation it is an interface is 
completely irrelevant. The prefix obfuscates the code by forcing you to 
consider the question "is the fact that this is an interface an important 
piece of information?" when in the vast majority of cases it will not be. 
It is simply a Car, and interface or not doesn't make one bit of 
difference. So again, by using this naming convention you're placing an 
emphasis on something that provides more information, but in doing so you 
force yourself to consider something that is irrelevant to actually 
solving the problem.
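Here is a short sketch of that refactoring (RealCar and MatchboxCar are 
the hypothetical classes from this discussion, with a made-up describe() 
method for illustration): because the interface keeps the name Car, every 
existing declaration such as Car winningCar continues to compile 
unchanged.

```java
// Sketch of the refactor: Car becomes an interface but keeps its name,
// so declarations like "Car winningCar" need no edits at all.
interface Car {
    String describe();
}

class RealCar implements Car {
    public String describe() { return "a real drag-racing car"; }
}

class MatchboxCar implements Car {
    public String describe() { return "a matchbox car"; }
}
```

The declaration site is exactly what it was when Car was a concrete class; 
only the point of construction knows which implementation it got.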

>
> ...see any difference?  I'm thinking that perhaps your "obfuscation"
> comment is slightly exaggerated based on your bias toward NOT seeing HN?
>
Your use of obfuscation is an exaggeration, though of course it is a 
common one in the programming world. That doesn't change the English 
definition of the word; it simply means to confuse. Obfuscation is 
obfuscation; your example just takes it to an extreme.

>
> Also, there is something to be said about being flexible enough to
> understand BOTH sides of the topic.  I can understand not wanting to
> "convert" to HN, but I don't see the need to justify it as "obfuscating"
> the code.  The goal of HN is to make the code CLEARER, and it does,
> whether or not you've accustomed yourself to accepting it or not.  I
> even use HN in naming my XML attributes...where the types must always be
> the same within the "code" (xml file) but handled differently in the
> parser.  I think that there are certainly different strokes for
> different folks, but this is not a religious thing.  If you following
>
Ahh, so you're arguing that the notation can be used to make bad code 
easier to understand. That is probably true, but why bother? If you can't 
get people to listen to feedback on code quality, how is Hungarian 
notation really going to help? Saying that the code would be easier to 
understand if HN were used is nice, but it's no different than saying the 
code would be easier to understand if it had just been written properly. 
My concern is improving real code quality, and in that scenario Hungarian 
notation is nothing more than a distraction. In my opinion you're fighting 
the right war (code quality) but on the wrong battleground.

>
>
> I don't ask that you or anyone else use HN if it doesn't match your
> "local" conventions/standards.  I do, however, ask that if you must use
> a global, at least indicate it using HN.

I don't disagree with you here, but then Java doesn't have global 
variables either. :-)

>
Kimbro Staken
Java and XML Software, Consulting and Writing http://www.xmldatabases.org/
Apache Xindice native XML database http://xml.apache.org/xindice
XML:DB Initiative http://www.xmldb.org