Before submitting a pull request for a new feature, please discuss it with us first by opening an issue or a discussion. We welcome new and junior developers. Feel free to join our Discord channel for help and guidance.
If you would like to start working on an issue, please comment on the issue on GitHub, so that we can assign you to that issue.
Make sure to take a look at the project's style guide.
The following guide details the steps to set up a local development environment for mirrord and run the E2E tests.
- GCC - only on Linux, GCC is needed for Go dynamic linking
- Rust
- NodeJS, ExpressJS
- Python, Flask, FastAPI
- Go
- Kubernetes Cluster (local/remote)
For E2E tests and testing mirrord manually you will need a working Kubernetes cluster. A minimal cluster can be easily set up locally using either of the following:
For ease of illustration and testing, we will use Minikube for the rest of the guide.
Download Minikube
Start a Minikube cluster with your preferred driver. Here we will use the Docker driver.
minikube start --driver=docker
Build mirrord-agent Docker Image.
Make sure you're logged in to GHCR.
Then run:
docker buildx build -t test . --file mirrord/agent/Dockerfile
❯ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
test latest 5080c20a8222 2 hours ago 300MB
Note: mirrord-agent is shipped as a container image because mirrord creates a job from this image and gives it elevated permissions on the same node as the impersonated pod.
Load mirrord-agent image to Minikube.
minikube image load test
Switch Kubernetes context to minikube.
kubectl config get-contexts
kubectl config use-context minikube
The E2E tests create Kubernetes resources in the cluster that kubectl is configured to use and then run sample apps with the mirrord CLI. The mirrord CLI spawns an agent for the target on the cluster, and runs the test app, with the layer injected into it. Some test apps need to be compiled before they can be used in the tests (this should be automated in the future).
The basic command to run the E2E tests is:
cargo test --package tests
However, when running on macOS a universal binary has to be created first:
scripts/build_fat_mac.sh
And then in order to use that binary in the tests, run the tests like this:
MIRRORD_TESTS_USE_BINARY=../target/universal-apple-darwin/debug/mirrord cargo test -p tests
If new tests are added, decorate them with the cfg_attr attribute macro to define what the tests target.
For example, a test which only tests sanity of the ephemeral container feature should be decorated with
#[cfg_attr(not(feature = "ephemeral"), ignore)]
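As an illustration of where the attribute goes, here is a minimal sketch of a test gated on the feature. The helper `target_arg` and the test name are hypothetical and only stand in for real test logic; the sketch also assumes the `ephemeral` feature is declared in the test crate's `Cargo.toml`.

```rust
/// Hypothetical helper standing in for real test logic:
/// builds a target argument of the form `pod/<name>`.
fn target_arg(pod_name: &str) -> String {
    format!("pod/{pod_name}")
}

#[cfg(test)]
mod tests {
    use super::*;

    // Without `--features ephemeral` the attribute expands to `#[ignore]`,
    // so the test is compiled but reported as ignored instead of being run.
    #[test]
    #[cfg_attr(not(feature = "ephemeral"), ignore)]
    fn ephemeral_container_sanity() {
        assert_eq!(target_arg("py-serv"), "pod/py-serv");
    }
}
```

With a setup like this, passing cargo's `--features ephemeral` flag to the test command runs the test; without the flag it is skipped.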
On Linux, running tests may exhaust a large amount of RAM and crash the machine. To prevent this, limit the number of concurrent jobs by running the command with e.g. -j 4.
The Kubernetes resources created by the E2E tests are automatically deleted when the test exits. However, you can preserve resources from failed tests for debugging. To do this, set the MIRRORD_E2E_PRESERVE_FAILED variable to any value.
MIRRORD_E2E_PRESERVE_FAILED=y cargo test --package tests
All test resources share a common label, mirrord-e2e-test-resource=true. To delete them, simply run:
kubectl delete namespaces,deployments,services -l mirrord-e2e-test-resource=true
The layer's integration tests test the hooks and their logic without actually using a Kubernetes cluster and spawning an agent. The integration tests usually execute a test app and load the dynamic library of the layer into it. The tests set the layer to connect to a TCP/IP address instead of spawning a new agent. The tests then have to simulate the agent: they accept the layer's connection, receive the layer's messages and answer them as the agent would.
Since they do not need to create Kubernetes resources and spawn agents, the integration tests complete much faster than the E2E tests, especially on GitHub Actions.
Therefore, whenever possible we create integration tests, and only resort to E2E tests when necessary.
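The "simulate the agent" idea can be sketched with the standard library alone. This is not the real mirrord protocol: the line-based `ping`/`pong` exchange and the function names `spawn_fake_agent` and `fake_layer_roundtrip` are assumptions made purely for illustration. The test side listens on a local TCP port, accepts the connection that the layer would open, and answers its message.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread::{self, JoinHandle};

/// Starts a fake "agent" on an ephemeral local port. The returned thread
/// accepts one connection, reads one line, replies, and yields what it read.
fn spawn_fake_agent() -> std::io::Result<(SocketAddr, JoinHandle<std::io::Result<String>>)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let handle = thread::spawn(move || -> std::io::Result<String> {
        let (mut stream, _peer) = listener.accept()?;
        let mut reader = BufReader::new(stream.try_clone()?);
        let mut request = String::new();
        reader.read_line(&mut request)?;
        // Answer as the (hypothetical) agent would.
        stream.write_all(b"pong\n")?;
        Ok(request.trim_end().to_string())
    });
    Ok((addr, handle))
}

/// Stand-in for the layer: connects to the given address instead of
/// spawning an agent, sends one message, and returns the reply.
fn fake_layer_roundtrip(addr: SocketAddr) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(b"ping\n")?;
    let mut reply = String::new();
    BufReader::new(stream).read_line(&mut reply)?;
    Ok(reply.trim_end().to_string())
}
```

A test built this way calls `spawn_fake_agent()`, hands the returned address to the code under test, and then joins the thread to check which message the fake agent received.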
Some test apps need to be compiled before they can be used in the tests (this should be automated in the future).
The basic command to run the integration tests is:
cargo test --package mirrord-layer
However, when running on macOS a dylib has to be created first:
scripts/build_fat_mac.sh
And then in order to use that dylib in the tests, run the tests like this:
MIRRORD_TEST_USE_EXISTING_LIB=../../target/universal-apple-darwin/debug/libmirrord_layer.dylib cargo test -p mirrord-layer
These tests will try writing the mirrord-intproxy logs to a file in /tmp/intproxy_logs (the directory will be created if it doesn't exist). The file name should be the same as the test name, e.g. /tmp/intproxy_logs/node_close_application_1_Application__NodeFileOps.log.
If log file creation fails, you should see the logs in stderr.
When running these in CI, an artifact is produced with all the test log files that could be created (scroll to Artifacts, which is under the Actions -> Summary page).
From the root directory of the mirrord repository, create a new testing deployment and service:
kubectl apply -f sample/kubernetes/app.yaml
sample/kubernetes/app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: py-serv-deployment
  labels:
    app: py-serv
spec:
  replicas: 1
  selector:
    matchLabels:
      app: py-serv
  template:
    metadata:
      labels:
        app: py-serv
    spec:
      containers:
        - name: py-serv
          image: ghcr.io/metalbear-co/mirrord-pytest:latest
          ports:
            - containerPort: 80
          env:
            - name: MIRRORD_FAKE_VAR_FIRST
              value: mirrord.is.running
            - name: MIRRORD_FAKE_VAR_SECOND
              value: "7777"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: py-serv
  name: py-serv
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
      nodePort: 30000
  selector:
    app: py-serv
  sessionAffinity: None
  type: NodePort
Verify everything was created after applying the manifest
❯ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 3h13m
py-serv NodePort 10.96.139.36 <none> 80:30000/TCP 3h8m
❯ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
py-serv-deployment 1/1 1 1 3h8m
❯ kubectl get pods
NAME READY STATUS RESTARTS AGE
py-serv-deployment-ff89b5974-x9tjx 1/1 Running 0 3h8m
To build this project, you will first need a Protocol Buffer Compiler installed.
scripts/build_fat_mac.sh
cargo build
The binary is created at ./target/<platform>/debug/mirrord
Sample web server - app.mjs (present at sample/node/app.mjs in the repo)
sample/node/app.mjs
import { Buffer } from "node:buffer";
import { createServer } from "net";
import { open, readFile } from "fs/promises";

// Exercises the file hooks: reads an existing file, then writes and reads a temp file.
async function debug_file_ops() {
  try {
    const readOnlyFile = await open("/var/log/dpkg.log", "r");
    console.log(">>>>> open readOnlyFile ", readOnlyFile);

    let buffer = Buffer.alloc(128);
    let bufferResult = await readOnlyFile.read(buffer);
    console.log(">>>>> read readOnlyFile returned with ", bufferResult);

    const sampleFile = await open("/tmp/node_sample.txt", "w+");
    console.log(">>>>> open file ", sampleFile);

    const written = await sampleFile.write("mirrord sample node");
    console.log(">>>>> written ", written, " bytes to file ", sampleFile);

    // Read into the buffer allocated for the sample file (previously the
    // unrelated 128-byte buffer was reused and sampleBuffer was never used).
    let sampleBuffer = Buffer.alloc(32);
    let sampleBufferResult = await sampleFile.read(sampleBuffer);
    console.log(">>>>> read ", sampleBufferResult, " bytes from ", sampleFile);

    await readOnlyFile.close();
    await sampleFile.close();
  } catch (fail) {
    console.error("!!! Failed file operation with ", fail);
  }
}

// debug_file_ops();

// Plain TCP echo server listening on port 80.
const server = createServer();
server.on("connection", handleConnection);
server.listen(
  {
    host: "localhost",
    port: 80,
  },
  function () {
    console.log("server listening to %j", server.address());
  }
);

function handleConnection(conn) {
  var remoteAddress = conn.remoteAddress + ":" + conn.remotePort;
  console.log("new client connection from %s", remoteAddress);
  conn.on("data", onConnData);
  conn.once("close", onConnClose);
  conn.on("error", onConnError);

  // Echo every received chunk back to the client.
  function onConnData(d) {
    console.log("connection data from %s: %j", remoteAddress, d.toString());
    conn.write(d);
  }

  function onConnClose() {
    console.log("connection from %s closed", remoteAddress);
  }

  function onConnError(err) {
    console.log("Connection %s error: %s", remoteAddress, err.message);
  }
}
RUST_LOG=debug target/debug/mirrord exec -i test -l debug -c --target pod/py-serv-deployment-ff89b5974-x9tjx node sample/node/app.mjs
Note: You need to change the pod name here to the name of the pod created on your system.
.
.
.
2022-06-30T05:14:01.592418Z DEBUG hyper::proto::h1::io: flushed 299 bytes
2022-06-30T05:14:01.657977Z DEBUG hyper::proto::h1::io: parsed 4 headers
2022-06-30T05:14:01.658075Z DEBUG hyper::proto::h1::conn: incoming body is empty
2022-06-30T05:14:01.661729Z DEBUG rustls::conn: Sending warning alert CloseNotify
2022-06-30T05:14:01.678534Z DEBUG mirrord_layer::sockets: getpeername hooked
2022-06-30T05:14:01.678638Z DEBUG mirrord_layer::sockets: getsockname hooked
2022-06-30T05:14:01.678713Z DEBUG mirrord_layer::sockets: accept hooked
2022-06-30T05:14:01.905378Z DEBUG mirrord_layer::sockets: socket called domain:30, type:1
2022-06-30T05:14:01.905639Z DEBUG mirrord_layer::sockets: bind called sockfd: 32
2022-06-30T05:14:01.905821Z DEBUG mirrord_layer::sockets: bind:port: 80
2022-06-30T05:14:01.906029Z DEBUG mirrord_layer::sockets: listen called
2022-06-30T05:14:01.906182Z DEBUG mirrord_layer::sockets: bind called sockfd: 32
2022-06-30T05:14:01.906319Z DEBUG mirrord_layer::sockets: bind: no socket found for fd: 32
2022-06-30T05:14:01.906467Z DEBUG mirrord_layer::sockets: getsockname called
2022-06-30T05:14:01.906533Z DEBUG mirrord_layer::sockets: getsockname: no socket found for fd: 32
2022-06-30T05:14:01.906852Z DEBUG mirrord_layer::sockets: listen: success
2022-06-30T05:14:01.907034Z DEBUG mirrord_layer::tcp: handle_listen -> listen Listen {
fake_port: 51318,
real_port: 80,
ipv6: true,
fd: 32,
}
Server listening on port 80
Send traffic to the Kubernetes Pod through the service
curl $(minikube service py-serv --url)
Check the traffic was received by the local process
.
.
.
2022-06-30T05:17:31.877560Z DEBUG mirrord_layer::tcp: handle_incoming_message -> message Close(
TcpClose {
connection_id: 0,
},
)
2022-06-30T05:17:31.877608Z DEBUG mirrord_layer::tcp_mirror: handle_close -> close TcpClose {
connection_id: 0,
}
2022-06-30T05:17:31.877655Z DEBUG mirrord_layer::tcp: handle_incoming_message -> handled Ok(
(),
)
2022-06-30T05:17:31.878193Z WARN mirrord_layer::tcp_mirror: tcp_tunnel -> exiting due to remote stream closed!
2022-06-30T05:17:31.878255Z DEBUG mirrord_layer::tcp_mirror: tcp_tunnel -> exiting
OK - GET: Request completed
Debugging mirrord can be hard because it runs inside another application's flow, so the act of debugging can itself affect the program and make it unusable or buggy (for example, because stdout is shared with scripts or other applications).
The recommended way to do it is to use mirrord-console. It is a small application that receives log information from different mirrord instances and prints it, controlled via the RUST_LOG environment variable.
To use mirrord console, run it:
cargo run --bin mirrord-console --features binary
Then run mirrord with the environment variable:
MIRRORD_CONSOLE_ADDR=127.0.0.1:11233
To see logs from the internal proxy, use the mirrord console.
To debug it with a debugger:
- Add tokio::time::sleep(Duration::from_secs(20)).await; somewhere near the start of the intproxy code.
- Set breakpoints in VS Code in the relevant lines of the intproxy code.
- Build mirrord.
- Run mirrord.
- Attach a debugger in VS Code to the intproxy process. On macOS you can do that with Cmd+Shift+P -> LLDB: Attach to Process... -> type intproxy and choose the mirrord intproxy process. The sleep you added at the start of the intproxy gives you time to attach the debugger.
By default, the agent's pod will complete and disappear shortly after the agent exits. In order to be able to retrieve the agent's logs after it crashes, set the agent's pod's TTL to a comfortable number of seconds. This configuration can be specified either as a command line argument (--agent-ttl), environment variable (MIRRORD_AGENT_TTL), or in a configuration file:
[agent]
ttl = 30
Then, when running with some reasonable TTL, you can retrieve the agent log like this:
kubectl logs -l app=mirrord --tail=-1 | less -R
This will retrieve the logs from all running mirrord agents, so it is only useful when just one agent pod exists.
If there are currently multiple agent pods running on your cluster, you would have to run
kubectl get pods
and find the name of the agent pod you're interested in, then run
kubectl logs <YOUR_POD_NAME> | less -R
where you would replace <YOUR_POD_NAME> with the name of the pod.
Adding a feature to mirrord that introduces a new hook (file system, network) can be tricky and there are a lot of edge cases we might need to cover.
In order to have a more structured approach, here’s the flow you should follow when working on such a feature.
- Start with the use case. Write an example use case of the feature, for example “App needs to read credentials from a file”.
- Write a minimal app that implements the use case - so in the case of our example, an app that reads credentials from a file. Start with either Node or Python as those are most common.
- Figure out what functions need to be hooked in order for the behavior to be run through the mirrord-agent instead of locally. We suggest using strace.
- Write a doc on how you would hook and handle the cases, for example:
  - To implement use case “App needs to read credentials from a file”
    - I will hook open and read, handling calls only with flags O_RDONLY.
    - Once open is called, I’ll send a blocking request to the agent to open the remote file, returning the return code of the operation.
    - Create an fd using memfd. The result will be returned to the local app, and if successful we’ll save that fd into a HashMap that matches between local fd and remote fd/identifier.
    - When read is called, I will check if the fd being read was previously opened by us, and if it is we’ll send a blocking read request to the agent. The result will be sent back to the caller.
    - And so on.
- This doc should later go into our mirrord docs for advanced developers so people can understand how things work.
- After approval of the implementation, you can start writing code, and add relevant E2E tests.
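The bookkeeping part of the example doc above (the HashMap between local and remote fds) could be sketched roughly as follows. The type and method names (`OpenFiles`, `on_open`, `route_read`) are hypothetical and not mirrord's actual implementation. The flag constants are hard-coded with their Linux and macOS values to keep the sketch dependency-free; real code would take them from the `libc` crate.

```rust
use std::collections::HashMap;

// Hypothetical aliases, for illustration only.
type LocalFd = i32;
type RemoteFd = u64;

// Values as defined on Linux and macOS; real code would use `libc::O_RDONLY` etc.
const O_RDONLY: i32 = 0;
const O_ACCMODE: i32 = 0o3;

/// Where a hooked `read` call should be served from.
#[derive(Debug, PartialEq)]
enum ReadRoute {
    /// The fd was opened through the agent; forward the read to it.
    Remote(RemoteFd),
    /// Not one of ours; fall through to the original `read`.
    Local,
}

/// Tracks which local fds correspond to files opened remotely.
#[derive(Default)]
struct OpenFiles {
    remote_by_local: HashMap<LocalFd, RemoteFd>,
}

impl OpenFiles {
    /// Called after a successful remote open. Only read-only opens are
    /// tracked; returns whether the fd was registered.
    fn on_open(&mut self, flags: i32, local_fd: LocalFd, remote_fd: RemoteFd) -> bool {
        if flags & O_ACCMODE != O_RDONLY {
            return false;
        }
        self.remote_by_local.insert(local_fd, remote_fd);
        true
    }

    /// Decides whether a `read` on `local_fd` goes to the agent or stays local.
    fn route_read(&self, local_fd: LocalFd) -> ReadRoute {
        match self.remote_by_local.get(&local_fd) {
            Some(remote_fd) => ReadRoute::Remote(*remote_fd),
            None => ReadRoute::Local,
        }
    }
}
```

Keeping the routing decision in a plain data structure like this makes it testable on its own, without a cluster or an agent.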
The mirrord-agent crate makes use of the #[cfg(target_os = "linux")] attribute to allow the whole repo to compile on macOS when you run cargo build.
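For readers unfamiliar with this kind of gating, here is a minimal sketch of the general pattern. The function name `agent_support` is hypothetical and not taken from the mirrord-agent crate. The Linux-only item is compiled only for Linux targets, and a fallback with the same signature keeps callers compiling everywhere else.

```rust
/// Linux-only implementation: compiled only when the target OS is Linux.
#[cfg(target_os = "linux")]
fn agent_support() -> &'static str {
    "linux: agent code is compiled in"
}

/// Fallback for every other target (for example macOS), so that code
/// calling `agent_support` still compiles there.
#[cfg(not(target_os = "linux"))]
fn agent_support() -> &'static str {
    "not linux: agent code is compiled out"
}
```

Because only one of the two definitions exists for any given target, tools that analyze the code for the host target alone do not see the other one, which is why the extra targets below are needed for rust-analyzer.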
To enable mirrord-agent code analysis with rust-analyzer:
- Install additional targets
rustup target add x86_64-unknown-linux-gnu
rustup target add aarch64-apple-darwin
rustup target add x86_64-apple-darwin
rustup target add aarch64-unknown-linux-gnu
- Add the additional targets to your local .cargo/config.toml:
[build]
target = [
    "aarch64-apple-darwin",
    "x86_64-apple-darwin",
    "x86_64-unknown-linux-gnu",
    "aarch64-unknown-linux-gnu",
]
If you're using the rust-analyzer VS Code extension, put this block in .vscode/settings.json as well:
{
    "rust-analyzer.check.targets": [
        "aarch64-apple-darwin",
        "x86_64-apple-darwin",
        "x86_64-unknown-linux-gnu",
        "aarch64-unknown-linux-gnu"
    ]
}