2020-08-25
Unfortunately I pretty quickly ran into a problem. I had three files: a key value store data structure header (kvslib.h) with an implementation (kvslib.cpp), and a main.cpp that used this data structure. Let's start with the header file for the key value store.
kvslib.h:
#include <string>
#include <map>
template <typename T>
class KeyValueStoreInterface
{
public:
virtual T get(std::string key) = 0;
virtual void put(std::string key, T value) = 0;
virtual void remove(std::string key) = 0;
};
template <typename T>
class InMemoryKVS : public KeyValueStoreInterface<T>
{
private:
std::map<std::string, T> internalKeyValue;
public:
InMemoryKVS();
virtual T get(std::string key);
virtual void put(std::string key, T value);
virtual void remove(std::string key);
};
Here I'm defining a class, KeyValueStoreInterface, that is generic over type T. It has three functions, get, put, and remove, which are pretty self explanatory. Importantly these are "virtual" functions. Virtual functions are functions that you expect to be redefined in child classes. Even if you have a pointer to a parent class, you can call virtual functions on it and the call will be forwarded to the derived class. The = 0 at the end of each one makes it pure virtual, meaning a child class has to provide it.
Below that is another class InMemoryKVS that is a subclass of KeyValueStoreInterface. It's basically exactly the same, including the fact that it is generic, except it says that it will have a private member variable internalKeyValue that is a map of strings to T.
kvslib.cpp:
#include "kvslib.h"
template <typename T>
InMemoryKVS<T>::InMemoryKVS()
{
}
template <typename T>
T InMemoryKVS<T>::get(std::string key)
{
return this->internalKeyValue[key];
}
template <typename T>
void InMemoryKVS<T>::put(std::string key, T value)
{
this->internalKeyValue.insert(std::make_pair(key, value));
}
template <typename T>
void InMemoryKVS<T>::remove(std::string key)
{
this->internalKeyValue.erase(key);
}
The implementation file is pretty simple. It only implements functions for InMemoryKVS and they work exactly how you'd expect.
main.cpp:
#include <iostream>
#include "kvslib.h"
int main()
{
InMemoryKVS<int> x = InMemoryKVS<int>();
x.put("hello", 5);
std::cout << x.get("hello") << std::endl;
x.remove("hello");
std::cout << x.get("hello") << std::endl;
InMemoryKVS<std::string> y = InMemoryKVS<std::string>();
std::string str("world");
y.put("hello", str);
std::cout << y.get("foo") << std::endl;
return 0;
}
Finally I use the InMemoryKVS from main. I make one that stores ints and another that stores strings. Simple!
Unfortunately compiling this results in a pretty inscrutable error:
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:function main: error: undefined reference to 'InMemoryKVS<int>::InMemoryKVS()'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:function main: error: undefined reference to 'InMemoryKVS<int>::put(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:function main: error: undefined reference to 'InMemoryKVS<int>::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:function main: error: undefined reference to 'InMemoryKVS<int>::remove(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:function main: error: undefined reference to 'InMemoryKVS<int>::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:function main: error: undefined reference to 'InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::InMemoryKVS()'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:function main: error: undefined reference to 'InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::put(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:function main: error: undefined reference to 'InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:vtable for InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >: error: undefined reference to 'InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:vtable for InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >: error: undefined reference to 'InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::put(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:vtable for InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >: error: undefined reference to 'InMemoryKVS<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::remove(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:vtable for InMemoryKVS<int>: error: undefined reference to 'InMemoryKVS<int>::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:vtable for InMemoryKVS<int>: error: undefined reference to 'InMemoryKVS<int>::put(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)'
bazel-out/k8-fastbuild/bin/_objs/kvs/main.pic.o:main.cpp:vtable for InMemoryKVS<int>: error: undefined reference to 'InMemoryKVS<int>::remove(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
collect2: error: ld returned 1 exit status
What on earth?
I see a lot of undefined references. But all of those functions are definitely defined. The fact that it specifies that it's looking for the <int> version of the function is an important clue. But ... why would it? Shouldn't the generic version satisfy the case where T is an int?
In order to explain what's happening here we need to understand what C++ generics, also known as templates, actually do.
(This won't be a super in depth explanation, since I'm just learning this myself!)
To tell C++ that a class is generic you use the template keyword, and that's for good reason: generics in C++ behave way more like Mustache or PHP templates than they do generics in other languages I'm used to. Take this line in my main.cpp file where we create an InMemoryKVS for ints:
InMemoryKVS<int> x = InMemoryKVS<int>();
When the compiler sees this it generates a whole new version of InMemoryKVS with every T replaced by int, and then uses that as the type of x. For example, the type signature of get would become:
int InMemoryKVS<int>::get(std::string key)
With this in mind there's a plausible explanation for the above error message. The int version of InMemoryKVS isn't being generated. But why?
(This also won't be super in depth because C++ compilers are very complicated)
Remember back when you compiled your very first C(++) program? It probably looked something like this:
g++ main.cpp -o main
Later on you had a project that had two files: some main file and some library file. You maybe compiled them like this:
g++ -c library.cpp -o library.o # compile the library
g++ -c main.cpp -o main.o # compile main
g++ -o main main.o library.o # link the object code, creating an executable named main
(Even if you compiled them like g++ main.cpp library.cpp -o main I think it is essentially still doing the above)
When the compiler is compiling main it doesn't know anything about library or vice versa. The translation units are completely independent. This is why header files are important: they allow the compiler to see the rough shape of files you're including without having to compile the entire thing.
Now we can understand what's happening with my key value store interface.
When the compiler compiles main it looks at main.cpp and any header files that it includes. In this case that's only kvslib.h (and iostream but let's ignore that for now). kvslib.h contains only the class definitions for KeyValueStoreInterface and InMemoryKVS, with the member functions declared but not implemented. This is sufficient to reference them.
However, when we reference InMemoryKVS<int> in main the compiler can't generate a complete definition for it. Why? Because all it has is the declaration of the constructor (and of get, put and remove), not the implementation. Is T used to allocate more memory? It has no way of knowing. So it just doesn't generate the function bodies and assumes they will be defined somewhere else.
Separately the compiler also compiles kvslib.cpp. This file also includes kvslib.h. These files don't contain any usages of InMemoryKVS at all, never mind an int usage, so the compiler doesn't generate an int version of InMemoryKVS here either.
Then when the linker comes along to link these two translation units together, it is only there that it sees there is no definition of InMemoryKVS<int> available, so finally the linker fails.
The solution is simple. Move the implementation of InMemoryKVS into the header so that when the compiler compiles main, which includes kvslib.h, it has the complete implementation and can generate an int version of InMemoryKVS replete with function definitions.
This was, as a newcomer, super confusing and I'm now beginning to understand why people curse C++. :D
Things I'm going to look into based off of this, that you might want to look into as well if you find yourself here: