Senior Software Engineer - Data Architecture

SENIOR SOFTWARE ENGINEER, Data Architecture, Base: $160K - $180K, Austin/Remote

About the company

Datafiniti was founded with the vision of empowering people and organizations with data. If someone was building technology that required data to power it, we wanted to remove the hurdle of data acquisition so that the technology could come to life. Ten years later, we have realized that vision and are looking to expand it as far as possible. We currently have a wide variety of customers in finance, retail, proptech, and marketing who use our technology to develop their products.

We collect a wide variety of information and content from thousands of sources and transform it all into highly-structured, instantly accessible data that can be integrated right away into any application or analysis someone is building. Startups and Fortune 500s alike use our various data sets and APIs to power thousands of solutions, including fraud prevention, investment algorithms, pricing analysis, mobile apps, lead generation, and much, much more.

In order to efficiently support such a wide variety of customers, we focus on building a highly flexible, robust, and scalable data infrastructure. Our technology is capable of ingesting, processing, and serving out billions of data points every day. Our small, close-knit team works together to develop technology and operational capabilities that allow us to meet the needs of an ever-expanding set of use cases.

Over the last two years, we have doubled our customer base and revenue each year and are on course to do so again this year! As we enter a new phase of growth, we are seeking to build a new engineering team from scratch and are specifically looking for people who are eager to "own" the technology and scale it for growth. This is a unique opportunity that offers an experience similar to joining a start-up on Day 1, but with an established customer base and a strong technology foundation already in place. We're hoping to bring on new team members who are excited to work on the challenges of our unique business and push the boundaries of what "scale" truly means.

About the role

We wish to hire an experienced software engineer who has a strong passion for developing highly-scalable data architectures. A significant part of our technology stack is dedicated to:

Ingesting semi-structured data from the web and other third-party sources
Processing this data to fit our own schema and quality standards
Storing it in a large Elasticsearch cluster
Serving out the data via a public-facing API to our end-users

At a high level, this engineer will be responsible for the maintenance, improvement, and expansion of this stack. The role is a great fit for someone who gets excited at the prospect of developing software for a highly-scalable data architecture.

Responsibilities

Specific responsibilities can be broken down into two categories:

Maintenance of existing systems

Maintaining and improving existing ETL and data stream pipelines that focus on data ingestion and normalization
Maintaining and optimizing a public-facing API that allows our end-users to query all of the data we have available (This API receives millions of requests per day)
Maintaining and tuning four large Elasticsearch clusters

Feature development

Implementing new API features that make it easier for our end-users to access the data we have available
Working with other teams to re-envision the core underlying data model that enables our end-users to do more with less effort
Architecting and implementing a microservice-oriented system that centers around the re-envisioned data model

Additional responsibilities include:

Diagnosing and fixing highly complex technical issues independently
Supporting the build and deployment pipeline and, when necessary, diagnosing and solving production support issues
Communicating individual and project-level development statuses, issues, risks, and concerns to technical leadership and management
Identifying and communicating cross-team dependencies to respective peers
Writing specification documents that include the feature-set being developed, explaining how these features will be implemented, and gaining stakeholder approval for the feature-set
Conducting thorough QA as a part of the development lifecycle prior to a production release

Qualifications

Specific technologies required for this role include:

Java (5+ years of experience): The vast majority of our data-ingestion and normalization microservices are written in Java
Node.js (2+ years of experience): Our public-facing API is written in Node.js
Express.js or a similar API framework such as Hapi, Koa, or Restify: Used by our public-facing API
Elasticsearch: We use Elasticsearch as the primary datastore for all of our normalized web data, with each cluster containing between 100 million and 3 billion records
MongoDB: Used as our primary user database
MySQL: Used by our microservices as their primary database
Redis: Used to maintain global state within our distributed system
RabbitMQ: Used as our primary message bus
AWS and Docker: All of our microservices are containerized via Docker and deployed to AWS

This role will also require a deep understanding of the following concepts or skills:

Thorough understanding of distributed systems and how to make them reliable, scalable and maintainable
Experience with large high-volume ETL pipelines
Experience with setting up and optimizing high-volume data-stream pipelines
Extensive experience maintaining and tuning Elasticsearch
Deep understanding of the design, implementation, and consumption of REST APIs
Excellent verbal and written communication skills
Strong analytical, problem solving, debugging and troubleshooting skills

Additional skills that will be considered a plus but are not required:

StatsD / Graphite / Grafana

Compensation and benefits for this role include:

$160K - $180K annual base salary
Equity in a company that is doubling its revenue every year
Comprehensive health insurance (medical, dental, vision, life)
Unlimited PTO, with a required MINIMUM of 15 days (preferably more)
7 federal holidays + 4 quarterly company-wide holidays

Additional benefits include:

Highly flexible work-life balance: If you work in Austin, we ask that you come into the office at least twice a week (for team bonding!); otherwise, you are free to work from home on your own schedule.
High degree of job autonomy: Team members are encouraged to experiment with their own implementations, propose ideas for company needs, and explore new solutions.
Career development: Executive leaders work with team members to align personal goals with company goals.
A supportive team environment: Team culture is focused on providing a supportive and positive environment for everyone.

NOTE: We prefer to hire someone who can work from our Austin office, but we are open to a remote hire for someone who is a great fit for the role.

Job Type: Full-time

Pay: $160,000.00 - $180,000.00 per year

COVID-19 considerations:

Our office is open again. We are allowing fully vaccinated employees to work from the office without wearing masks. Anyone who is not yet fully vaccinated is asked to continue working from home.