LinkedIn University Pages are a case study in building big data apps the right way

Professional network LinkedIn rolled out its new University Pages feature … to much fanfare, but the pages are as much a matter of smart engineering as they are of smart business strategy, Gigaom reports.

… LinkedIn Engineering published a blog post explaining the technology behind University Pages, and it underscores the importance of understanding your products, your data and the tools you need to process it.

Of course the new product started with an idea, but after that, blog post author Josh Clemm noted, the company’s data scientists spent years combing through member profiles, gathering and standardizing data about 23,000 colleges and universities. They built graph data models for each school, with the school as the primary node and things like related schools and LinkedIn-member alumni as secondary ones.
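To make the graph model concrete, here is a minimal Python sketch of a school-centered graph: the school is the primary node, and related schools and alumni hang off it as secondary nodes. This is an illustration only, not LinkedIn's actual schema. The names `SchoolNode`, `Alum`, `add_related`, `add_alum`, and `top_employers` are all hypothetical, as is the idea of storing an employer on each alum.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Alum:
    """Secondary node: a member who attended the school (hypothetical fields)."""
    name: str
    employer: str


@dataclass
class SchoolNode:
    """Primary node: one standardized school record with edges to secondary nodes."""
    name: str
    related_schools: set = field(default_factory=set)  # names of related schools
    alumni: list = field(default_factory=list)         # Alum nodes

    def add_related(self, other: "SchoolNode") -> None:
        # Treat "related" as symmetric: link both schools to each other.
        self.related_schools.add(other.name)
        other.related_schools.add(self.name)

    def add_alum(self, alum: Alum) -> None:
        self.alumni.append(alum)

    def top_employers(self, n: int = 3) -> list:
        # Count alumni per employer and return the n most common as (employer, count).
        return Counter(a.employer for a in self.alumni).most_common(n)
```

With a structure like this, a question such as "where do alumni work" becomes a simple aggregation over the alumni attached to a school node, which is the kind of uniform view the standardized data makes possible.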

That’s why you’re now able to visit any school’s LinkedIn profile (at least the ones that have been updated to the new format) and see the same information, such as where alumni work and who attended.

Under the covers, University Pages runs atop some serious big data technologies, many of which LinkedIn built itself. Those graphs are all stored in LinkedIn’s new flagship database technology, EspressoDB. Hadoop powered much of the work involved in getting all that data into a standard format.
