Catarang in 2016 – Building a CI Server

Catarang is a Continuous Integration (CI) server that I’ve been working on in my spare time for the past two years, and I made a ton of progress on it in 2016. CI servers, for those who don’t know, do repetitive tasks like compiling code every time a change is checked in or automating pushes to production servers. The most popular one out there is Jenkins, but there are a ton of other options listed on this excellent Wikipedia page: Comparison of continuous integration software. In this post, I’ve written up the progress I’ve made on Catarang in the past year and what I’m looking to do with it in the future.

Why build a new CI Server?

A question I often get when I talk about Catarang is: why build a new CI server when there are already so many options out there? The two main reasons are that none of the existing CI servers fulfill all my needs, and I wanted a project to learn Go on. I’d heard good things about the language and had tried out the little playground they have, but I learn best when I actively use a language on a project. Throughout the two years I’ve been working on Catarang, I’ve fallen in love with Go and find it an utter joy to program in. Once I got the hang of the language (which happened quickly), I’ve been able to get a surprising amount done with relatively few head-scratching issues.

I’ve also learned a ton about CI servers and unique issues surrounding how they work. These things are incredibly complicated and take a lot of fiddling to get right! Lastly, it’s been fun to build a large scale tool by myself and be able to rework all of the code at the drop of a hat if I don’t like how it’s structured. I try to keep that to a minimum and make constant forward progress, but at times it’s beneficial to take a step backward and to the side so that it’s easier to go forward in the future.

Progress in 2016

Catarang Contributions for 2016

The amount of energy I had to work on Catarang fluctuated quite a bit this year due to work, illnesses, life issues, and politics. There were several weeks where I worked on it non-stop in my free time and then there were several months where I didn’t have any energy to even think about it.

This makes sense considering I’m only working on my own time and not being paid for it, but there were certainly days when I felt guilty about not programming on it when I had wanted to. Catarang as a whole has only 105 commits, so with 81 of them coming in 2016, it’s easy to see that I did a significant chunk of work on the project this year.

Feature Set

Catarang is still in its infancy, so I wouldn’t suggest anyone start using it yet, but I am proud of how the features are coming together and it’s starting to feel like a real project. Here are all of the features I worked on this year:

Plugin System

I started off building all of the functionality for git, running arbitrary commands, and saving off artifacts straight into Catarang, but realized it’d be better to have a defined interface to make writing plugins easier. I’ve gone through several iterations and there’s still work to do, but I’m happy with where it is right now. Each plugin is completely segregated from the rest of the code, and I eventually plan on moving them outside of the project itself so that they can be updated independently of Catarang. I briefly looked at Hashicorp’s go-plugin and even got a prototype working, but stopped going down that route as it felt like I was going too deep into that system and ignoring other necessary features. Right now a plugin is as easy to add as filling out a basic interface and adding a single line to the plugin list to register it.
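To give a feel for that shape, here’s a minimal sketch of the idea in Go. All of the names here (the interface, the registry, the stub plugin) are hypothetical illustrations, not Catarang’s actual API:

```go
package main

import "fmt"

// Plugin is a hypothetical minimal interface a plugin would fill out.
type Plugin interface {
	Name() string
	Run(args []string) error
}

// registry maps plugin names to constructors; registering a new plugin
// is a single line in this map.
var registry = map[string]func() Plugin{
	"git": func() Plugin { return &gitPlugin{} },
}

// gitPlugin is a stub standing in for a real source-control plugin.
type gitPlugin struct{}

func (g *gitPlugin) Name() string { return "git" }

func (g *gitPlugin) Run(args []string) error {
	fmt.Println("git plugin running with", args)
	return nil
}

func main() {
	if create, ok := registry["git"]; ok {
		p := create()
		p.Run([]string{"clone", "https://example.com/repo.git"})
	}
}
```

The nice part of the registry-of-constructors approach is that the rest of the server only ever sees the interface, so plugins stay segregated from the core code.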

Job Template and Instantiation

I created a job template so that you can create a job via a single file that you keep in your depot. Catarang will pull in the appropriate plugins from the plugin system and run the commands that you specify in the template. Catarang can already handle it when you check in a change to the config file, and will rebuild the job based on the new template, which took a bit of work to support. This is the number one feature I had originally wanted in a self-hosted CI server that wasn’t available in the one I was using at work, so it’s nice to have made really good progress on it. The template itself is pretty simple, and this is what Catarang’s currently looks like:

Catarang’s build config file

It’s a bit verbose since it’s in JSON and that will likely be changed at a later point in time, but it’s a great start.

Job Instances

Each time a job runs, it creates a new instance of itself and saves off all of the log files associated with running that job. Each instance is separate because if you change a job’s configuration template, you’ll get very different output and want to keep that information around. It’s also useful to go back and see why certain runs of the job failed and why some succeeded. This feature has changed quite a bit over the past year due to creating the job template and the plugin system, but it’s at a pretty good place right now.

Unique Logging of Commands

I’m pretty proud of how Catarang logs all of the commands it runs. For every command, it keeps:

  1. A list of the arguments used to run the command
  2. A high level description of what’s being run and the plugin that ran the command
  3. The output from the command segmented into the Standard Out and Standard Error sections

Because of that last part, I can highlight on the CI server (and in emails) which parts actually failed so you don’t have to go digging for it when it does! I haven’t seen any other CI server do this, so it felt good to get in. It was a bit difficult to get in as I learned there are some deep, dark secrets of StdOut and StdErr that I didn’t know before. Did you know you can get interwoven lines of output in your console that look like this:

Yellow is a standard single line of output; red is an error that arrives in the middle of that single output line.
Yikes! It gets even worse if you have sub-commands being run in parallel and only a single output window.

A Web Server

One of the awesome parts of Go is that it’s suuuuuper easy to set up a web server and serve content. The most basic setup is a single line of code, and a more complex example isn’t far from that. I know next to nothing about web development, so I have a very simple web interface for Catarang that allows you to add new jobs, run them, delete them, clean them, and see all of the output from each job instance. This is the largest thing that will have to be worked on before I release it to the world, but it’s also the least important until I get a solid feature set up and running. I used to have fancy websockets working so the site would live-update, but that broke at some point and I didn’t care enough to go back and fix it since so much of the architecture of the program was changing.

What’s Next?

In 2017 I’m going to work much more on making Catarang stable and usable for the general public. It’d be neat to be able to release a very alpha version and start getting feedback on it, but that’s a bit of a stretch goal.

As for the next features I’m going to be implementing, a lot of them will focus on expanding the Job’s capabilities. I’d like to add the ability to chain multiple jobs together in a pipeline using if/or/and blocks. Getting this right is the most important thing for a CI server to be useful, so it will likely take quite a bit of iteration before I’m happy with it.

I’d also like to start expanding the plugins into things people might find useful, like Slack notifications and emails. There’s a long way to go, but if I implement the basics that most people are going to want, I should get enough traction for people to want to create their own plugins to fill in the parts I haven’t made yet.

I have a small number of tests, but not nearly enough. I hadn’t bothered with them while I was focused on building out a prototype, but now that the project is getting larger they’re becoming much more useful. I’m going to be aiming for a fairly high test coverage percentage, and we’ll see where I get with that.

Lastly, I’d like to start fleshing out the UI on the server into something that’s not a programmer’s prototype. I’m not very good at web development mostly due to lack of experience, so this either entails me finding someone that wants to work on this project with me (for free) or spending a lot of time learning how to build a highly interactive website from scratch. Both seem fairly difficult to do, so we’ll see how this goes.

I’m still super excited to be working on Catarang two years after I first came up with the idea, which is great. I’d love to be able to release it and have people other than myself use it on a day to day basis so I keep building it with that in mind. I know that if I got some dedicated time to work on it I could really build it up into something special, so I’ll have to slowly work in that direction since I can only use spare time right now. If you want to keep up to date on my progress you can follow me on Twitter or follow Catarang on GitHub.

Beer and Programming: 2

Beer:


Idjit! – Dugges Ale- & Porterbryggeri AB: I generally love stouts, and especially imperial stouts, but Idjit just didn’t grab me. It was incredibly thick (at points almost feeling chunky) and had too heavy of a taste. It felt like they tried to make the imperial stout to end all imperial stouts, but just went too far. Not really worth picking up.
Punkin Ale – Dogfish Head Brewery: Dogfish Head always puts out one of my favorite pumpkin beers, and this year is no exception. As far as pumpkin beers go, it’s on the lighter end, with the pumpkin flavor not overwhelming you but instead being a nice accompaniment to the brown beer. I think I still like Elysian’s The Great Pumpkin better, but that’s just because it matches the styles I tend to like more. Good for a pumpkin-style beer.
Flipside Red IPA – Sierra Nevada Brewing Co.: Reds are my favorite beer type, so seeing a Red IPA was intriguing to me as I’d never known that was a thing. At first taste, it definitely reminds you of a solid red, but then you notice a bitterness to it from the IPA part afterward. It’s definitely a decent beer, but I think in the end I’d rather have a red or an IPA, I don’t think they need to be combined. Good for a red IPA; something to try once, but not a regular beer.
Narwhal Imperial Stout – Sierra Nevada Brewing Co.: I can’t speak highly enough of Narwhal. It’s one of my favorite Imperial Stouts, if not my absolute favorite. It’s thick, but not especially so like the Idjit. It has hints of coffee, but isn’t overpowering. And lastly, it’s fucking delicious. I love it so much, I’ve picked up two four packs to put in my beer cabinet to save for later when I can’t get it anymore. Pick up this beer if you see it, you won’t regret it.

Programming:

I haven’t done much, if any, programming at home this week, so I thought I’d talk a bit about some stuff I’ve already worked on in my engine. One of the first things I worked on was a memory manager, since C++ is not terribly good at tracking memory and letting you know when you’ve leaked it. Memory management is one of the hardest things to learn when coming to C++ from a managed language, and for good reason: it’s not something you pick up by learning the language, but by messing it up so many damn times that you get into a rhythm when programming new things. On Windows, there is a very easy way to see what you’ve leaked when exiting the program (and potentially during it, if you want to take snapshots).

As a primer for how Windows handles memory leak detection, you can read MSDN’s page about it, which has fairly good samples. The short of it is, you can include this code, and it’ll automatically spit out stuff at the close of the program to tell you which bytes are leaking.

// top of file
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

// entry point into the application
int WINAPI WinMain(__in HINSTANCE hInstance, __in_opt HINSTANCE hPrevInstance, __in LPSTR lpCmdLine, __in int nShowCmd)
{
#if defined(_DEBUG) && defined(WIN32)
	// Track every allocation and dump anything still live when the program exits.
	_CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
#endif
	// ... rest of the application ...
	return 0;
}

If you were to do an allocation like int* int_ptr = new int; and never delete it, the CRT will spit out something like:

Detected memory leaks!
Dumping objects ->
{570} normal block at 0x00E26948, 4 bytes long.
Data: < > CD CD CD CD
Object dump complete.

Unfortunately, this isn’t terribly helpful! It might be if the data were a string, but when the data is an object, the output is relatively worthless. If you get lucky, the leak will occur at the same point every run, and if so, you can put a breakpoint on the creation of the data (the 570 in this example means it was the 570th memory allocation).

So where do we go from here? A memory manager is the next step. We want to both track our allocations better and optimize how memory is allocated at run-time so that lots of little allocations don’t slow us down. There are two real ways to go about this: you can overload the new operator, or you can create an insertion point through which all memory is allocated and track it that way. The former is easier but a bit messier, and the latter is what you probably want to do in the long run. I’ve done both before, and in this instance I decided to go the easier route by overloading new and delete.

You can track all uses of new by doing:

inline void* operator new(size_t size, const char* file, const int line)
{
	void* ptr = malloc(size);
	MemoryManager::GetMemoryManager()->AddAllocation(ptr, size, file, line);
	return ptr;
}
inline void operator delete(void* ptr, const char*, const int) throw()
{
	MemoryManager::GetMemoryManager()->RemoveAllocation(ptr);
	free(ptr);
}

#ifdef ENABLE_MEMORY_MANAGER
	#define DBG_NEW new(__FILE__, __LINE__)
#else
	#define DBG_NEW new
#endif // ENABLE_MEMORY_MANAGER

You have to overload not just those functions, but all of the different forms of new/delete (including the ones you created by redefining new as DBG_NEW). You can see all of them here. I’ve had some odd problems with code that overloads new but doesn’t include the throw() part in the functions (even with exceptions turned off), so it may take some finagling before they all work, but they should be good.

The internals of the AddAllocation and RemoveAllocation functions are pretty simple: they maintain a map from each pointer to the data about its allocation (size, filename, and line number).

struct MemoryAllocation
{
	MemoryAllocation() : _size(0), _filename("default"), _line_number(0) { }
	MemoryAllocation(const unsigned int size, const char* filename, const int line_number)
		: _size(size), _filename(filename), _line_number(line_number)
	{
		// do nothing
	}
	unsigned int _size;
	const char* _filename;
	int _line_number;
};

void MemoryManager::AddAllocation(const void* ptr, const unsigned int size, const char* file, const int line)
{
	_allocations[ptr] = MemoryAllocation(size, file, line);
}

void MemoryManager::RemoveAllocation(const void* ptr)
{
	_allocations.erase(ptr);
}

Record all of that information and you’re good to go! The example we had above with the integer being allocated (but not deleted) now looks like this when we spit the information out on close of the application:

c:\programming\sideproject\source\main.cpp(48): Memory Leak at 0x00E26948 size: 4 bytes

This is MUCH more useful. This still doesn’t get us all the way to complete usefulness, however. What if you have a function like:

int* CreateInt()
{
	return new int;
}

That line and file will always be the same, and will never tell you what’s actually important: the original caller! So how do you find that out? You grab the callstack and keep it around in addition to the other information. I haven’t done that yet, as it’s platform-specific (and in some cases a pain in the ass), but Google is your friend. It’s next on my list if I need to make any additions to my memory manager, and it will let you see where every allocation was created from.

So how does this break down, and why did I call it the lazy way to go? Good questions! First, it breaks down when you have third-party libraries or DLLs that allocate and deallocate memory; depending on how they’re built, you generally shouldn’t expect third-party code to let you track those kinds of allocations. Second, because you’re overriding new, other people (including your coworkers or friends) can override it as well, and if you compile with code that’s not yours and it does bad things, it can completely negate what you’re trying to do. It’s the lazy way to go because it only works if everything you compile allocates through your DBG_NEW macro. As a counter-example, standard library containers like std::vector allocate through std::allocator rather than through DBG_NEW, so those allocations will never show up with file and line information. Tracking all of the memory allocated in a C++ program is really hard. Be wary of static allocations as well.

In the end, this is a very easy way for a C++ application to track memory allocations without expending the time to write a true memory allocator like the one in the Bitsquid engine.

Please give me ideas and/or comments about this post as I’d like Beer and Programming to become a weekly post. It takes a lot of work for me to do, but I enjoy it so far, so I’m going to continue it until I either run out of ideas or people stop caring about my posts.

Beer and Programming: 1


I’ve been lamenting the fact that I don’t regularly update this blog, and I think the best way for me to fix that is to post about the things that I’m actively doing. I love all different kinds of beer, and I learn as much as possible about programming, so why not combine the two into a weekly roundup? I can talk about the different beer that I’ve had during the week and the different programs I’ve worked on as a professional and a hobbyist.

This week, let’s start off with the beer. First off, I tried out the Epic Imperial: Red Ale.


It was exactly what you’d expect from an Imperial Red: a red with a bit of a kick to it. I think I need to come up with some sort of rating system for beer, so I’m going to say there are three ratings:

    * Amazing and you should pick up regardless of what else you see
    * Good for its type or good for a different type of beer
    * Not really worth picking up.

Epic Imperial Red Ale leans toward “amazing and you should pick it up” without quite getting there. It’s a solid, well-made beer, but not exceptional.

The second beer I tried out was the Elysian: The Great Pumpkin.


I love pumpkin beers, and this is no exception. I’d tried their pumpkin stout and been underwhelmed, but this one was quite good: solid without being overwhelming. If you can find it, I’d say it manages to come in at the bottom of “amazing and you should pick it up”. If you don’t like pumpkin you should stay away, but otherwise it’s pretty damn good.

On to programming. I’ve been trying to learn OpenGL lately and have been having a hard time finding tutorials aimed at people who know how to program but don’t know how to program graphics. It’s surprisingly hard to do so. Right now I’m using rastertek.com’s and arcsynthesis’s examples, neither of which is particularly good for learning how to structure a game’s graphics engine, but each teaches OpenGL in its own way that helps out a little. I wish I could find an “OpenGL for game programmers” website, but, alas, I have been unable to do so.

Work on the engine has been going well, however, as I’ve been understanding more and more as of late. I enjoy the knowledge even if I sometimes hate how hard and different it is from my everyday, normal programming. As of now, I have a triangle rendering, but soon hope to have textures or generic polygons.


Little Code Tricks

I think that programming is 75% knowing the language and 25% knowing little tricks you can do with it to make your life easier. Here’s one that I thought I’d post about (although it’s super basic).

When you’re creating a vector, matrix, or some other well known storage unit class you often want to access the data in multiple ways. A simple way to do it without adding any accessors is to do this:

union
{
  // Anonymous structs inside a union are technically a compiler
  // extension in C++, but all of the major compilers support them.
  struct
  {
    float x, y, z;
  };
  float m[3];
};

This allows you to access the data using x, y, and z like normal, but also to index into the same memory using m[0], m[1], and m[2], without the extra overhead of another variable or an accessor!

Yay little tricks :).

Programming at Home

I’ve started programming at home once again after a nine-month break. It’s kind of bizarre to think that I program at work for 8-10 hours a day and then, after that, sometimes still want to program more at home. I’m starting from scratch, yet again, but kind of not at the same time: I’m taking pieces from my previous two personal projects and combining them. One of them I haven’t changed since October 18, 2008, the other since August 8, 2010. I’ve learned quite a lot in the 3+ years since I last attempted something like this, and my style has changed since the last project as well.

It’s going to take a while to get up and running, but the interesting part is that I know what I’m working on and how it’s going to play out and help me toward the future. I’m compartmentalizing things much, much better than I had previously done, and once I get the setup going, it should be easy to add projects to the group to combine to a bigger whole.

If it sounds kind of vague and nebulous, that’s because it is. I’ll be writing technical posts in the future as I go on to help myself and others possibly learn some things that I’ve picked up over the years.

What’s up first? Config files. So simple, yet so incredibly powerful. Hopefully that post will come soon.