-
Notifications
You must be signed in to change notification settings - Fork 196
Coding Advice and Standards
This page contains coding advice and coding standards for our main developers, students and anyone who is kind enough to contribute. Additions to this page are welcome.
Good software to have:
- http://brew.sh makes it easy to install software
- http://hub.github.com nice command line interface for git
This is probably the most important thing. It is good to get things implemented quickly, but it is essential that they be implemented correctly in nicely structured code.
You might not know it, but bugs make a lot of people waste a lot of time and tarnish our reputation. When people people hit a bug, it disrupts their work, and we look bad. Then, they complain, usually to me. This makes me waste time in email, apologies, "we will look into it" emails. Then, we (typically me) have to open up a JIRA issue in our tracking system, assemble all the files to be able to replicate the bug. Sometimes this is already hard. It often takes me one hour to do this. Then someone has to fix it, which requires replicating the bug, and often requires understanding someone else's code. Then, someone else has to verify that it works correctly, this is typically me. Many emails or comments fly around, it's a big time sink. A "simple" bug causes many hours of extra work for many people and for me, and not least, for you, as you need to track it down.
Take your time to think about the corner cases and write robust correct code. If you think some situation will "never happen" you are wrong. It will happen, and sometimes it happens at the worst possible time, when you are demonstrating to a sponsor.
The only acceptable number of warnings in our code is zero! Warnings are symptoms of bad code, so they should always be investigated and eliminated. We are about to embark in several significant refactorings of Karma, and it much easier to refactor code when there are no warnings to start with.
Please look at the code you have checked in and make sure that it does not produce any warnings in Eclipse. Please review the parts of your code that issue warnings and improve the code is that no warning is produced. Adding @SuppressWarnings to your code is NOT ACCEPTABLE. I sampled some of the warnings and they are all resolvable. For example, unused variables can be commented out and insert //TODO: to get back to them.
Use the default Eclipse code indentation so that everyone indents t2) he code the same way. This is important because if you do Command-Shift-F in a file, which re-indents it, we don't want it to change the file because you are using a different indentation style. Otherwise, the source comparison features of Git may report irrelevant changes and make merging conflicts significantly harder.
If you find that your class is over 500 lines or so, it will be very difficult for other people to understand it and maintain it. Your class has grown to do too many things and it ought to be broken into smaller better designed classes. Ideally, a class should only be a few pages of code (there are exceptions), and no method should be over a page long. Once a class is too large, it is likely that it encapsulates too much functionality and it is hard for others to understand it. Methods over a page long are simply unreadable. The best code is not the one that executes the fastest. It is the one that others can understand.
Declarations like these private ArrayList<ArrayList<Vector>> msegs; are way too complicated. This occurs in a file that has already gone over the size limit. It is likely that this class has all kinds of methods to set up and manipulate this complicated data structure. If broken into classes, each of the classes will be significantly simpler, and others will understand what is represented in this complex structure.
This makes it easier for people to know who to ask when they don't understand a class. If you change a class significantly add your name to the top.
If you find yourself copy pasting code to do something new, then stop. It is time to refactor. Create a super-class, a helper function, a util class, something that makes sense and captures the common code in one place. Copy pasted code is hard to maintain, requires finding all the places where a bug or bad feature was copy pasted to, and fixing all of them. Copy/pasting is fast, but a terrible thing to do as you create headaches for everyone else. Never do it!
If a method is declared as private, never ever change it to public because you want to call it. There was a good reason it was made private, and unless you fully understand what you are doing you are probably creating a bug right there. Same thing goes for final variables, it is a worse sin to remove final declarations.
All other methods should be made private. Keep your public APIs small and simple.
This is subtle but super important. For example, make it so that your constructors or factory methods create "ready-to-go-objects" that don't require callers to initialize them in particular ways by calling other methods. People don't know they have to do that and end up causing bugs. For example, an empty constructor is bad unless it is "ready-to-go", ie, I can start using it safely without calling set methods or initialize methods on it.
There is a simple reason for this. One of our sponsors checks for print statements and we get penalized for doing it. Besides, those messages that are so important to you are annoying and distracting to everyone else. Use the logging facilities. Don't even use them in debugging and plan to delete them before you commit. You will forget. I don't want to see even commented out print statements.
We are preparing the code to run on the cloud, and creating files in your commands breaks things. You need to figure out a different way to save state. We will have a mechanism to save state, but creating a file somewhere in the disk or source tree is not allowed.
Code needs to evolve, and people need to change existing code. Make it easy for them. Remember that other people will change "your" code, so you might as well make it easy for them to do so correctly.
Users will do all kinds of strange things. Make sure that your GUIs are robust to all bad things that users can do, explain to them what is expected, and never let users do anything that results in them inadvertently losing their work. This makes them very unhappy and makes us unpopular with users. We are trying to become popular with users.
In a class that needs logging, a static logger should be created using LoggerFactory with the the full hierarchical name of the enclosing class. Using this hierarchical naming convention allows us to take advantage of hierarchical loggers. It effectively lets us selectively enable different logging behaviors for different packages including logging to files in addition to the console, enable debugging for a single class, or disable logging for given loggers all together. If a logger is created using only the simple name for a class, we lose this hierarchy and could potential suffer from logging collisions. Also, make sure you are only creating loggers using the SLF4J interface and not any particular implementation like Log4j.
Correct way to get a logger
private static Logger logger = LoggerFactory.getLogger(SubmitCleaningCommand.class);
Wrong way to get a logger
private static Logger logger = Logger.getLogger(GetDataPropertiesForClassCommand.class.getSimpleName());
Commented out code should be removed. If it was committed at any point and is then needed in the future, it'll still be in the git history. Debugging code should not be committed either unless it's in the form of logging statements. Debugging code should not run during regular program execution.
Main methods should be used only for specific purposes. If there's a main method in something like edu.isi.karma.cleaning.features.RecordClassifier2, chances are it shouldn't be there. Sometimes they seem to be relied on as a means for testing functionality. If tests are needed, they should be codified in the form of unit tests, preferably JUnit tests.