Transparency, Trust and Testability: Our Data Engineering Manifesto
The backbone of our company is the data engineering capability we offer our clients.
Although the technologies and design patterns used for data platforms evolve constantly, good engineering disciplines are universal. Our approach is to do the basics well, regardless of the stack and services you work with.
Our manifesto is simple:
Data processes should be simple and composable
Aggregations should be clearly written and testable
Transformations and data validation rules should be human-readable and reportable
The journey of every datum should be auditable back to source
Given the same input data, a process should guarantee the same results each time it runs
We believe - whether you are building a data warehouse, a data lake or a lakehouse - that these principles are fundamental and timeless. They are the key to building up to complexity, and the two sketches below show what they look like in practice.
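First, composability, testability and determinism. This is a minimal sketch in plain Python; the names parse_rows, total_by_customer and pipeline are illustrative, not from any particular library. Each step is a small pure function, the aggregation can be tested in isolation, and the same input always yields the same output.

```python
from collections import defaultdict

def parse_rows(lines):
    """Parse raw 'customer_id,amount' lines into typed pairs."""
    for line in lines:
        customer_id, amount = line.strip().split(",")
        yield customer_id, float(amount)

def total_by_customer(pairs):
    """Aggregate amounts per customer; a pure function of its input."""
    totals = defaultdict(float)
    for customer_id, amount in pairs:
        totals[customer_id] += amount
    return dict(totals)

def pipeline(lines):
    """Compose the steps; each can be swapped or tested on its own."""
    return total_by_customer(parse_rows(lines))

def test_pipeline():
    # Same input, same output, every run.
    raw = ["a,10.0", "b,5.0", "a,2.5"]
    assert pipeline(raw) == {"a": 12.5, "b": 5.0}

if __name__ == "__main__":
    test_pipeline()
    print("ok")
```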
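Second, human-readable validation and auditability. Again a sketch with hypothetical names (RULES, Record, validate): each rule pairs a plain-English description with a predicate, so the rule set doubles as its own report, and each record carries its source file and line so any failure can be traced back to the original datum.

```python
from dataclasses import dataclass

# Each rule is a (description, predicate) pair: readable and reportable.
RULES = [
    ("amount is non-negative", lambda row: row["amount"] >= 0),
    ("customer_id is present", lambda row: bool(row["customer_id"])),
]

@dataclass
class Record:
    source_file: str   # lineage: where the datum came from
    source_line: int   # lineage: exactly which line
    customer_id: str
    amount: float

def validate(records):
    """Yield a human-readable report entry for every failed rule."""
    for rec in records:
        row = {"customer_id": rec.customer_id, "amount": rec.amount}
        for description, predicate in RULES:
            if not predicate(row):
                yield f"{rec.source_file}:{rec.source_line} failed: {description}"

records = [
    Record("orders.csv", 1, "a", 10.0),
    Record("orders.csv", 2, "", -3.0),
]
for issue in validate(records):
    print(issue)
# orders.csv:2 failed: amount is non-negative
# orders.csv:2 failed: customer_id is present
```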
Typical Tech Stacks
OLTP RDBMS: PostgreSQL, Aurora, Azure SQL, SQLite (embedded)
ETL: Prefect, dbt, Apache Airflow, Pentaho
OLAP: Redshift, Snowflake
Caching and Search: Elasticsearch, Redis, memcached
Events and Streaming: AWS Lambda + SQS, Apache Kafka, RabbitMQ
Distributed Processing: AWS Glue, Apache Spark