Open Source Friday with Simulacrum - Simulate the GitHub API Like a Pro
By GitHub
Key Concepts
- Simulacrum: An open-source toolset for simulating APIs and services, built on ExpressJS, designed for high-fidelity application previews and acceptance testing.
- Transport Layer Simulation: The core architectural principle of Simulacrum, mimicking data communication at the network protocol level (e.g., HTTP, WebSockets) rather than deeper application logic.
- OpenAPI Schema: A standard for describing RESTful APIs, used by Simulacrum to define routes, validate responses, and generate example data.
- In-memory Data Store (Immer): An immutable data store used within Simulacrum to manage simulated data, allowing for mutations over time while maintaining a clear state.
- High-Fidelity Application Previews: The ability to test front-end or back-end code against realistic, yet simulated, API responses and behaviors, closely mirroring a production environment.
- Acceptance Testing: A level of software testing that verifies whether the system meets business requirements, often facilitated by Simulacrum's ability to control API responses and states.
- OAuth Simulation: Simulacrum's capability to mimic complex authentication flows, such as OAuth, by generating tokens (e.g., JWTs) so that the system under test behaves as if it were interacting with a real identity provider.
- Webhooks: A mechanism for simulators to communicate with each other or trigger external events, enabling the simulation of asynchronous interactions and complex system integrations.
- Incremental Development: The methodology of starting with a simple simulation (e.g., serving JSON) and gradually adding complexity and fidelity as needed.
GitHub Universe Experience and Open Source Engagement
The video begins with attendees expressing their enthusiasm for GitHub Universe, highlighting its energizing atmosphere, the diverse community, and the opportunity to connect with developers. The conference is praised for its blend of technical topics and personal interactions, fostering a "nerdy but also fun and interactive" vibe. Attendees appreciate the platform for learning and improving productivity, with one participant specifically mentioning gaining open-source insights and a "fresh new deck." The speaker, Jacob, notes the conference's approachable and friendly environment, contrasting it with more corporate-driven events.
The discussion transitions to open-source contributions, with the host, Cadesa, asking about good starting points. Jacob suggests triaging issues, looking for "good first issues" (e.g., Home Assistant has 2.9K issues with many labels, including "good first issue"), and contributing to projects one uses and cares about.
Jacob's Journey and Simulacrum's Genesis
Jacob introduces himself as a former structural engineer who transitioned to software engineering. He has been involved in programming and open source for 10-15 years, initially as a hobby and later by writing code to automate tasks in his structural engineering role. He is part of larger projects like Tauri and is currently employed at Frontside, where Simulacrum originated.
The inspiration for Simulacrum dates back 8-10 years, stemming from a client project that required dealing with complex Bluetooth connectivity issues. To overcome the difficulty of testing with real devices and scaling multiple apps, Frontside developed a "Bluetooth simulator" that mimicked the Bluetooth connection at the communications layer. This approach allowed them to test app code thoroughly by simulating the transport layer, which proved to be a "natural abstraction." Over time, the generic and widely applicable pieces of this concept were open-sourced, leading to Simulacrum.
Simulacrum's Core Architecture and Functionality
Simulacrum is designed to run offline and locally, built upon ExpressJS with a suite of helper functions. Jacob emphasizes that "there's nothing entirely special about it" in the sense that it uses common tools, making it easy to incrementally build and gain value. The key is that it allows developers to "fake the pieces that we need to" while maintaining a production-like codebase, ensuring that the code paths tested are the same as those run in production.
Simulacrum is available as an npm package (e.g., `@simulacrum/foundation-simulator`) and includes a changelog for tracking updates.
Step-by-Step Usage and Examples
- Basic JSON Serving: The simplest use case involves serving static JSON files. A `createFoundationSimulationServer` can be configured to serve all files at a specific route. For example, starting a simulator on `localhost:9090` can return a JSON response for a GET request to `/todos`. The simulator's root endpoint typically serves as a status page with basic logs.
- In-Memory Data Store: For more dynamic simulations, Simulacrum includes an in-memory data store that uses Immer, an immutable data library. This allows developers to dump initial data into the store and then mutate it over time, providing a "high-fidelity" experience that feels close to interacting with a real database and API.
- OpenAPI Schema Integration: A powerful feature is the ability to integrate with OpenAPI schemas. Developers can provide an OpenAPI schema, and Simulacrum will use it to define routes and handlers. This allows for:
  - Validation: Responses can be validated against the OpenAPI spec, ensuring adherence to the API contract. If the spec changes, the tools will indicate what code needs modification.
  - Multiple Schemas: Simulacrum supports merging multiple OpenAPI documents, enabling front-end teams to develop and ship code against simulated API endpoints months before the back-end routes are fully implemented.
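The validation idea above can be sketched in a few lines. This is a minimal illustration in plain TypeScript, not Simulacrum's validator: the `ObjectSchema` type, `todoSchema`, and `validateResponse` are hypothetical names, and the schema is a hand-written stand-in for one small piece of an OpenAPI document.

```typescript
// Hypothetical, hand-written stand-in for a small slice of an OpenAPI response schema.
type FieldType = "string" | "number" | "boolean";

interface ObjectSchema {
  required: string[];
  properties: Record<string, FieldType>;
}

// Assumed shape of one todo item returned by GET /todos.
const todoSchema: ObjectSchema = {
  required: ["id", "title"],
  properties: { id: "number", title: "string", done: "boolean" },
};

// Returns a list of contract violations; an empty list means the body matches.
function validateResponse(schema: ObjectSchema, body: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of schema.required) {
    if (!(key in body)) problems.push(`missing required field "${key}"`);
  }
  for (const [key, value] of Object.entries(body)) {
    const expected = schema.properties[key];
    if (expected === undefined) {
      problems.push(`unexpected field "${key}"`);
    } else if (typeof value !== expected) {
      problems.push(`field "${key}" should be ${expected}, got ${typeof value}`);
    }
  }
  return problems;
}

// A response that honours the contract, and one that drifted from it.
const validProblems = validateResponse(todoSchema, { id: 1, title: "write docs", done: false });
const driftedProblems = validateResponse(todoSchema, { id: "1", name: "write docs" });
```

Here `validProblems` is empty, while `driftedProblems` lists the missing `title`, the mistyped `id`, and the unknown `name` field. That is the kind of feedback that points at the code needing changes when a spec moves.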
Advanced Simulators: GitHub API and Auth0
Jacob demonstrates the GitHub API Simulator, a more advanced package built on Simulacrum. This simulator includes:
- A GraphQL server (using `create-yoga`, a production-grade GraphQL server).
- OAuth simulation: It generates JWTs and handles authentication flows sufficiently to "trick any system using it into thinking it's something like a real Auth0." This allows testing authentication paths without hitting actual identity providers.
- `extendAPI`: Provides a way to drop into basic Express.js routes for custom handling (e.g., verifying tokens).
- Leveraging OpenAPI Examples: The GitHub API simulator utilizes the examples embedded within the GitHub OpenAPI spec. If a specific route handler isn't defined, Simulacrum falls back to serving the example data from the spec, effectively providing a "technically simulated" endpoint for the entire GitHub API without extensive manual configuration.
- Data Generation: It uses Zod and Faker.js to build full-featured, realistic data from small initial chunks (e.g., providing an organization and repository name can generate a rich repository object).
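The fallback to spec examples described in the list above can be pictured as a two-step lookup: use a hand-written handler if one exists, otherwise serve the example embedded in the spec. The sketch below is plain TypeScript under that assumption; `specExamples`, `customHandlers`, and `respond` are illustrative names, and the example payloads are made up rather than copied from the GitHub OpenAPI document.

```typescript
interface SimResponse {
  status: number;
  body: unknown;
}

type Handler = () => SimResponse;

// Hypothetical slice of an OpenAPI document: operation key -> embedded example.
const specExamples: Record<string, unknown> = {
  "GET /user": { login: "octocat", id: 1 },
  "GET /repos/{owner}/{repo}": { name: "hello-world", private: false },
};

// Hand-written handlers exist only for the routes a test actually cares about.
const customHandlers: Record<string, Handler> = {
  "GET /user": () => ({ status: 200, body: { login: "simulated-user", id: 42 } }),
};

// Prefer a custom handler, then the spec example, then a 404.
function respond(operation: string): SimResponse {
  const handler = customHandlers[operation];
  if (handler) return handler();
  if (operation in specExamples) {
    return { status: 200, body: specExamples[operation] };
  }
  return { status: 404, body: { message: "Not Found" } };
}

const customised = respond("GET /user");
const fromExample = respond("GET /repos/{owner}/{repo}");
const unknownRoute = respond("GET /does-not-exist");
```

With this arrangement every documented route answers with something plausible from day one, and fidelity is added route by route only where a test needs it.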
Advantages Over Traditional Mocking/Stubbing
Jacob highlights several key differentiators for Simulacrum:
- Flexibility and Broad Applicability: Because it's built on common tools like ExpressJS, Simulacrum can simulate virtually any system or API, including those using WebSockets, AI APIs, Electron/Tauri apps, and mobile applications.
- Production-Like Testing: Simulacrum keeps simulators "as close to production as possible," meaning developers don't have to modify their application code for testing. The same code paths are tested locally and in CI as are run in production, avoiding "caveats" or discrepancies.
- Testing Complex States: Simulacrum excels at testing scenarios that are difficult with real APIs:
  - Error States: Developers can easily configure simulators to return specific HTTP error codes (403, 401, 404, 500) and different error responses, allowing comprehensive testing of error handling in the application.
  - Specific Data States: It's easy to put data into specific "bad states" that might be hard to achieve in a real system, enabling robust testing against edge cases and unexpected data.
  - Case Study: Jacob recounts a client incident where an API change led to "[object Object]" errors. Simulacrum allowed them to recreate the exact error state, diagnose the root cause (a `try-catch` block expecting strings instead of objects), and confidently validate their fix.
  - Performance Testing: Simulacrum can generate large datasets (e.g., 100,000 records) to test how a system (e.g., front-end paging) responds under load.
- Scalability: Simulacrum is more lightweight than running multiple full-fledged services locally (e.g., "10 Java services").
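To make the error-state and case-study points concrete, here is a small sketch in plain TypeScript. It assumes a simulator that lets a test force an error response for a route, and it guesses at the shape of the client bug (stringifying an error body that has become an object). `forcedErrors`, `handleRequest`, and both `describeError` functions are hypothetical names, not Simulacrum's API or the client's actual code.

```typescript
interface SimResponse {
  status: number;
  body: unknown;
}

// Hypothetical per-route overrides that a test sets to force an error state.
const forcedErrors = new Map<string, SimResponse>();

function handleRequest(route: string): SimResponse {
  const forced = forcedErrors.get(route);
  if (forced) return forced;
  return { status: 200, body: { ok: true } };
}

// Buggy client code: assumes the error body is always a plain string.
function describeErrorBuggy(res: SimResponse): string {
  return "Request failed: " + String(res.body);
}

// Fixed client code: handles both the old string body and the new object body.
function describeError(res: SimResponse): string {
  const body = res.body;
  if (typeof body === "string") return `Request failed: ${body}`;
  if (typeof body === "object" && body !== null && "message" in body) {
    return `Request failed: ${String((body as { message: unknown }).message)}`;
  }
  return `Request failed with status ${res.status}`;
}

// Recreate the state: the API now returns an object where a string used to be.
forcedErrors.set("GET /repos", { status: 403, body: { message: "Forbidden" } });
const failed = handleRequest("GET /repos");

const buggyMessage = describeErrorBuggy(failed); // "Request failed: [object Object]"
const fixedMessage = describeError(failed);      // "Request failed: Forbidden"
```

Because the error state is set explicitly, the same setup that reproduces the bug also serves as the regression test for the fix.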
GitHub Enterprise and AI Chat Simulation
Simulacrum can simulate GitHub Enterprise instances. This is achieved not through a specific "enterprise mode," but by allowing configuration of different API_URL and API_Schema values. This is particularly useful for clients with older or customized GitHub Enterprise instances, which often have different URLs and stricter security rules (e.g., requiring 17 layers of security to create a repository). Simulacrum allows developers to bypass these real-world constraints for testing purposes.
Regarding AI chat simulators, Jacob confirms that it's possible and that a client has done it (though the work is proprietary). Since chat interfaces often use WebSockets or server-sent events, Simulacrum's underlying ExpressJS foundation can be extended with appropriate packages to handle these protocols.
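As a rough illustration of "configuration instead of an enterprise mode", the sketch below resolves a simulator target from two settings. Everything here is an assumption made for the example: the `SimulatorTarget` shape, the `API_SCHEMA` spelling of the second setting, the default port, and the schema file names. The only outside fact relied on is that GitHub Enterprise Server exposes its REST API under an `/api/v3` path prefix.

```typescript
interface SimulatorTarget {
  apiUrl: string;    // base URL the application under test will call
  apiSchema: string; // which OpenAPI document describes that API
}

// Hypothetical resolver: the same simulator code, pointed at different targets.
function resolveTarget(env: Record<string, string | undefined>): SimulatorTarget {
  return {
    apiUrl: env["API_URL"] ?? "http://localhost:3300",
    apiSchema: env["API_SCHEMA"] ?? "api.github.com.json",
  };
}

// Default: simulate the public GitHub API.
const publicTarget = resolveTarget({});

// Enterprise-style: a different URL prefix and a different (older) schema file.
const enterpriseTarget = resolveTarget({
  API_URL: "http://localhost:3300/api/v3",
  API_SCHEMA: "ghes-example.json",
});
```

The application under test only sees a base URL, so switching between the two targets needs no change to its code.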
Contributing and Community
Jacob encourages contributions to Simulacrum, particularly to packages like the Auth0 simulator, which requires ongoing maintenance due to API changes. He emphasizes that sharing this work benefits the entire community. He invites users to open issues, join the Discord server for discussions, and share use cases. He notes that clients often use "double-digit custom simulators" for their various APIs, especially within open-source communities like Backstage (the open-source developer portal platform created at Spotify), which naturally fits Simulacrum's use case for meshing data from many sources.
Running in CI/CD and Inter-Simulator Communication
Simulacrum servers can be run in CI/CD environments just like any other server, providing consistent testing locally and in CI. They can be used by testing frameworks like Playwright or for local development (e.g., quickly spinning up data for a front-end without a live backend).
A significant differentiator is the ability for multiple simulators to run together and communicate. Jacob demonstrates this with a "ping-back" example involving two simulators running on different ports (3050 and 3051).
- Webhooks: Simulators can trigger webhooks, which are essentially `fetch` calls between them. This allows for simulating asynchronous interactions and complex workflows where one service's action triggers another.
- State Synchronization: In the example, a call to `external boop` on simulator 3051 triggers a webhook to simulator 3050, which then increments a `boop` counter on 3050. This demonstrates how simulators can communicate and synchronize their internal states, crucial for testing deep integrations (e.g., a GitHub API integration creating a repository and then receiving a webhook from GitHub).
- Flexible Communication: Simulators can communicate via public API calls or through "back channels" by directly accessing each other's internal data stores.
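The ping-back flow above can be modelled without any networking. In this sketch the `deliver` function stands in for the `fetch` call between simulators, and an in-process registry keyed by port stands in for the two running servers. The route names `/boop` and `/external-boop` and all type and function names are assumptions made for illustration, not the demo's actual code.

```typescript
// Stand-in for the fetch call one simulator makes to another.
type Deliver = (targetPort: number, path: string) => void;

interface Simulator {
  port: number;
  boops: number;
  handle(path: string): number; // returns an HTTP-style status code
}

function createSimulator(port: number, peerPort: number, deliver: Deliver): Simulator {
  const sim: Simulator = {
    port,
    boops: 0,
    handle(path: string): number {
      if (path === "/boop") {
        sim.boops += 1; // local state change
        return 200;
      }
      if (path === "/external-boop") {
        deliver(peerPort, "/boop"); // webhook to the peer simulator
        return 202;
      }
      return 404;
    },
  };
  return sim;
}

// In-process "network": look up the target simulator by port and call it.
const registry = new Map<number, Simulator>();
const deliver: Deliver = (targetPort, path) => {
  registry.get(targetPort)?.handle(path);
};

const sim3050 = createSimulator(3050, 3051, deliver);
const sim3051 = createSimulator(3051, 3050, deliver);
registry.set(sim3050.port, sim3050);
registry.set(sim3051.port, sim3051);

// Calling the external route on 3051 increments the counter on 3050.
const status = sim3051.handle("/external-boop");
```

After the call, the counter on 3050 is 1 and the counter on 3051 is still 0, which mirrors the state synchronization shown in the demo. In a real setup `deliver` would be an HTTP request to the other simulator's port.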
Authentication and State Persistence
Simulacrum's in-memory data store (using Immer) means that data is cleared upon restarting the simulator. However, developers can provide initial state (e.g., a default user for the Auth0 simulator) when starting the simulator, ensuring a consistent baseline. This ephemeral nature is a benefit, allowing developers to "do whatever you want and put things in as bad of a state as you want and if you really muck something up, you restart it."
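A minimal sketch of that lifecycle, assuming a store seeded with initial state and replaced wholesale on restart. It uses object spreads as a stand-in for Immer's immutable updates, and `createStore`, `StoreState`, and the sample users are hypothetical names rather than Simulacrum's store API.

```typescript
interface User {
  id: string;
  email: string;
}

interface StoreState {
  readonly users: readonly User[];
}

// Hand-rolled immutable store; a stand-in for an Immer-backed one.
function createStore(initial: StoreState) {
  let state: StoreState = initial;
  return {
    getState: (): StoreState => state,
    // Each update builds a new state object instead of mutating the old one.
    addUser(user: User): void {
      state = { ...state, users: [...state.users, user] };
    },
  };
}

// Baseline supplied at startup, e.g. a default user.
const initialState: StoreState = {
  users: [{ id: "default", email: "default@example.com" }],
};

let simStore = createStore(initialState);
const snapshotBefore = simStore.getState();

// Put the data into a deliberately bad state for a test.
simStore.addUser({ id: "u2", email: "not-a-valid-email" });
const snapshotAfter = simStore.getState();

// "Restarting" the simulator: a fresh store from the same initial state.
simStore = createStore(initialState);
const snapshotRestarted = simStore.getState();
```

The earlier snapshot is untouched by the update, and the restarted store is back to the single default user, so a test can wreck the data freely and still start from a known baseline.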
Documentation is available in the README files of each package and through the Discord server.
Roadmap and Future Directions
Jacob outlines the future plans for Simulacrum:
- Webhooks: A near-term focus, as a concrete use case for event testing has recently emerged.
- Orchestration for Multiple Simulators: While previous efforts focused on a main server starting all simulators, the immediate goal was quick single-simulator startup. The next big push is to develop helpers for orchestrating multiple simulators, providing a service that can start them all, handle networking, and offer observability into their states. This is conceptually likened to a Kubernetes control plane, aiming to provide greater visibility into the simulators' existing states and how they change over time.
Conclusion
Simulacrum offers a robust and flexible solution for API simulation, enabling high-fidelity testing and development across various platforms. By leveraging common tools like ExpressJS and OpenAPI, it allows developers to create realistic, production-like environments that are easy to set up, incrementally build upon, and integrate into CI/CD pipelines. Its ability to simulate complex scenarios like OAuth, error states, and inter-service communication via webhooks provides significant advantages over traditional mocking, fostering greater confidence in code deployments and accelerating development cycles. The project encourages community contributions and aims to further enhance the orchestration and observability of multiple simulators.