Ants Deployment #41

Merged
merged 10 commits into from
Jan 14, 2025
5 changes: 0 additions & 5 deletions .gitmodules

This file was deleted.

1 change: 0 additions & 1 deletion Dockerfile
@@ -8,7 +8,6 @@ RUN apk add --no-cache gcc musl-dev git

WORKDIR /build

COPY go-libp2p-kad-dht /build/go-libp2p-kad-dht/
COPY go.mod go.sum ./
RUN go mod download

6 changes: 1 addition & 5 deletions Makefile
@@ -9,10 +9,6 @@ REPO?=019120760881.dkr.ecr.us-east-1.amazonaws.com
REPO_USER?=AWS
REPO_REGION?=us-east-1


tools:
go install -tags 'postgres,clickhouse' github.com/golang-migrate/migrate/v4/cmd/[email protected]

non-cluster-migrations:
mkdir -p db/migrations/local
for file in $(shell find db/migrations -maxdepth 1 -name "*.sql"); do \
@@ -27,7 +23,7 @@ local-migrate-down: non-cluster-migrations
migrate -database 'clickhouse://localhost:9000?username=ants_local&database=ants_local&password=password&x-multi-statement=true' -path db/migrations/local down

local-clickhouse:
docker run --name ants-clickhouse --rm -p 9000:9000 -e CLICKHOUSE_DB=ants_local -e CLICKHOUSE_USER=ants_local -e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 -e CLICKHOUSE_PASSWORD=password clickhouse/clickhouse-server
docker run --name ants-clickhouse --rm -p 9000:9000 -p 8123:8123 -e CLICKHOUSE_DB=ants_local -e CLICKHOUSE_USER=ants_local -e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 -e CLICKHOUSE_PASSWORD=password clickhouse/clickhouse-server

.PHONY: build
build:
98 changes: 75 additions & 23 deletions README.md
@@ -1,18 +1,36 @@
# Ants Watch

[![ProbeLab](https://img.shields.io/badge/made%20by-ProbeLab-blue.svg)](https://probelab.io)
![Build Status](https://img.shields.io/github/actions/workflow/status/probe-lab/ants-watch/ci.yml?branch=main)
![License](https://img.shields.io/github/license/probe-lab/ants-watch)

DHT Client Population Monitor.
Ants watch is a DHT client monitoring tool. It is able to log the activity of all nodes in a DHT network by
carefully placing _ants_ in the DHT keyspace. For nodes to utilize the DHT they need to perform routing table maintenance tasks.
These tasks consist of requesting other nodes close to oneself in the DHT keyspace. Ants watch ensures
that at least one of these requests will **always** hit one of the ants. If a request hits an ant, we record information about the requesting peer, such as its agent version,
supported protocols, IP addresses, and more.

<img src="./resources/ants.png" alt="Ants Watch" height="300"/>
**Supported networks:**

Authors: [guillaumemichel](https://github.com/guillaumemichel), [kasteph](https://github.com/kasteph)
* [Celestia](https://celestia.org/)
* Can be extended to support other networks using the [libp2p DHT](https://github.com/libp2p/specs/tree/master/kad-dht).


## Table of Contents

## Overview
- [Ants Watch](#ants-watch)
- [Table of Contents](#table-of-contents)
- [Methodology](#methodology)
- [Setup](#setup)
- [Configuration](#configuration)
- [Usage](#usage)
- [Queen](#queen)
- [Health](#health)
- [Ants key generation](#ants-key-generation)
- [License](#license)


## Methodology

* `ants-watch` is a DHT honeypot monitoring tool, logging the activity of all nodes in a DHT network.
* An `ant` is a lightweight [libp2p DHT node](https://github.com/libp2p/go-libp2p-kad-dht), participating in the DHT network, and logging incoming requests.
* `ants` participate in the DHT network as DHT server nodes. `ants` need to be dialable by other nodes in the network. Hence, `ants-watch` must run on a public IP address either with port forwarding properly configured (including local and gateway firewalls) or UPnP enabled.
* The tool releases `ants` (i.e., spawns new `ant` nodes) at targeted locations in the keyspace in order to _occupy_ and _watch_ the full keyspace.
@@ -24,41 +42,53 @@ Authors: [guillaumemichel](https://github.com/guillaumemichel), [kasteph](https:
* The `ant queen` is responsible for spawning, adjusting the number and monitoring the ants as well as gathering their logs and persisting them to a central database.
* `ants-watch` does not operate like a crawler, where after one run the number of DHT client nodes is captured. `ants-watch` logs all received DHT requests and therefore, it must run continuously to provide the number of DHT client nodes over time.

### Supported networks

* [Celestia](https://celestia.org/)
* Can be extended to support other networks using the [libp2p DHT](https://github.com/libp2p/specs/tree/master/kad-dht).

## Setup

Before installing dependencies:
### Prerequisites

``` shell
git submodule init
git submodule update --init --recursive --remote
```
You need go-migrate to run the clickhouse database migrations:

```shell
make tools

You'll also need to install some tools: `make tools`.
# or

go install -tags 'clickhouse' github.com/golang-migrate/migrate/v4/cmd/[email protected]
```

You need to setup a Clickhouse database with:
You can then start a Clickhouse database with:

```shell
make local-clickhouse

# or

docker run --name ants-clickhouse --rm -p 9000:9000 -p 8123:8123 -e CLICKHOUSE_DB=ants_local -e CLICKHOUSE_USER=ants_local -e CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1 -e CLICKHOUSE_PASSWORD=password clickhouse/clickhouse-server
```

This will start a Clickhouse server with the container name `ants-clickhouse` that's accessible on the non-SSL native port `9000`. Further database parameters are:
This will start a Clickhouse server with the container name `ants-clickhouse` that's accessible on the non-SSL native port `9000`. The relevant database parameters are:

* host: `localhost`
* port: `9000`
* username: `ants_local`
* password: `password`
* database: `ants_local`
* secure: `false`

Then you need to apply the migrations with:

```shell
make local-migrate-up
```

## Configuration
This will take the migration files in the `./db/migrations` directory and strip all the `Replicated` merge tree prefixes before applying the migrations.
The `Replicated` merge tree table engines only work with a clustered clickhouse deployment (e.g., clickhouse cloud). When
running locally, you will only have a single clickhouse instance, so applying `Replicated` migrations will fail.

Suggestions for improving this workflow are welcome.
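
As one possible direction, the stripping step could be written as a small Go helper instead of shell. The sketch below is an assumption, not what the Makefile does: `stripReplicated` is a hypothetical name, and it only renames `Replicated*MergeTree` engines to their single-node equivalents, assuming the migrations pass no replication-specific engine arguments.

```go
package main

import (
	"fmt"
	"regexp"
)

// replicatedEngine matches engine names such as ReplicatedMergeTree or
// ReplicatedReplacingMergeTree and captures the non-replicated engine name.
var replicatedEngine = regexp.MustCompile(`\bReplicated(\w*MergeTree)\b`)

// stripReplicated (hypothetical helper) rewrites Replicated*MergeTree engines
// to their single-node equivalents. Engine arguments are left untouched.
func stripReplicated(sql string) string {
	return replicatedEngine.ReplaceAllString(sql, "${1}")
}

func main() {
	migration := "CREATE TABLE requests (id UUID) ENGINE = ReplicatedMergeTree ORDER BY id;"
	fmt.Println(stripReplicated(migration))
}
```

The rewritten statement can then be written to `db/migrations/local` and applied with `make local-migrate-up` as before.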

### Configuration

The following environment variables should be set for ants-watch:

@@ -74,18 +104,18 @@ ANTS_NEBULA_CONNSTRING=postgres://nebula:password@localhost/nebula?sslmode=disab

## Usage

Once the database is setup and migrations are applied, you can start the honeypot.
Once the database is set up and migrations are applied, you can start the honeypot.

`ants-watch` needs to be dialable by other nodes in the network. Hence, it must run on a public IP address either with port forwarding properly configured (including local and gateway firewalls) or UPnP enabled.

### Queen

In [`cmd/honeypot/`](./cmd/honeypot/), you can run the following command:
To start the ants queen, you can run the following command:

```sh
go run ./cmd/ants queen --upnp # for UPnP
# or
go run ./cmd/ants queen --first_port=<port> --num_ports=<count> # for port forwarding
go run ./cmd/ants queen --first.port=<port> --num.ports=<count> # for port forwarding
```

When UPnP is disabled, ports from `firstPort` to `firstPort + nPorts - 1` must be forwarded to the machine running `ants-watch`. `ants-watch` will be able to spawn at most `nPorts` distinct `ants`.
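
The forwarding rule above is plain arithmetic. As a sketch, with the hypothetical helper `antPorts`, where `firstPort` and `numPorts` stand for the values passed via `--first.port` and `--num.ports`:

```go
package main

import "fmt"

// antPorts (hypothetical helper) lists the ports that must be forwarded:
// firstPort through firstPort+numPorts-1, one port per ant.
func antPorts(firstPort, numPorts int) []int {
	if numPorts <= 0 {
		return nil
	}
	ports := make([]int, 0, numPorts)
	for i := 0; i < numPorts; i++ {
		ports = append(ports, firstPort+i)
	}
	return ports
}

func main() {
	fmt.Println(antPorts(6000, 3)) // prints [6000 6001 6002]
}
```

The length of the returned slice is also the maximum number of distinct ants that can be spawned.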
@@ -98,6 +128,28 @@ You can run a health check on the honeypot by running the following command:
go run . health
```

## Ants Key Generation

The queen ant periodically queries the [Nebula](https://github.com/dennis-tra/nebula) database to retrieve the list of connected DHT servers. Kademlia identifiers of these peers are then inserted into a [binary trie](https://github.com/guillaumemichel/py-binary-trie/). Using this binary trie, the queen defines keyspace zones of at most `bucket_size - 1` peers. One ant must be present in each of these zones in order to capture all DHT requests reaching the `bucket_size` closest peers to the target key.

Kademlia identifiers are derived from a libp2p peer ID, which itself is derived from a cryptographic key pair. Hence, generating a key that falls into a specific zone of the binary trie isn't trivial and requires brute force. All keys generated during the brute-force search are persisted on disk, because they may be useful in the future. When an ant isn't needed anymore, its key is marked as available for reuse. This also allows reusing the same peer IDs for the ants across multiple runs of the honeypot.
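
To make the brute-force step concrete, here is a minimal sketch using only the Go standard library. It is not the project's implementation: it assumes the identifier is the SHA-256 digest of the raw Ed25519 public key, whereas libp2p hashes the peer ID, and `hasBitPrefix` and `bruteForceKey` are hypothetical names. A zone is written as a string of `0`/`1` bits; matching a prefix of `n` bits takes about `2^n` attempts on average.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// hasBitPrefix reports whether the leading bits of id equal prefix,
// where prefix is a string of '0' and '1' characters.
func hasBitPrefix(id [32]byte, prefix string) bool {
	if len(prefix) > 256 {
		return false
	}
	for i := 0; i < len(prefix); i++ {
		bit := (id[i/8] >> (7 - uint(i%8))) & 1
		if bit != prefix[i]-'0' {
			return false
		}
	}
	return true
}

// bruteForceKey generates Ed25519 keys until the SHA-256 digest of the
// public key (a simplified stand-in for the Kademlia identifier) starts
// with prefix. It returns the key, the identifier and the attempt count.
func bruteForceKey(prefix string) (ed25519.PrivateKey, [32]byte, int, error) {
	for attempts := 1; ; attempts++ {
		pub, priv, err := ed25519.GenerateKey(rand.Reader)
		if err != nil {
			return nil, [32]byte{}, attempts, err
		}
		id := sha256.Sum256(pub)
		if hasBitPrefix(id, prefix) {
			return priv, id, attempts, nil
		}
	}
}

func main() {
	_, id, attempts, err := bruteForceKey("1011")
	if err != nil {
		panic(err)
	}
	fmt.Printf("found id %x... after %d attempts\n", id[:4], attempts)
}
```

Keys rejected along the way still land in some zone, which is why persisting every generated key for later reuse pays off.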

## Related Efforts

- [hydra-booster](https://github.com/libp2p/hydra-booster) - A DHT Indexer node & Peer Router

## Maintainers

- [@guillaumemichel](https://github.com/guillaumemichel)
- [@kasteph](https://github.com/kasteph)
- [@dennis-tra](https://github.com/dennis-tra)

## Contributing

Feel free to dive in! [Open an issue](https://github.com/probe-lab/ants-watch/issues/new) or submit PRs.

This project follows the [Contributor Covenant](http://contributor-covenant.org/version/1/3/0/) Code of Conduct.

## License

This project is licensed under the MIT License - see the [LICENSE](./LICENSE) file for details.
[MIT](LICENSE) © ProbeLab
72 changes: 65 additions & 7 deletions ant.go
@@ -3,17 +3,19 @@ package ants
import (
"context"
"fmt"
"time"

"github.com/caddyserver/certmagic"
ds "github.com/ipfs/go-datastore"
p2pforge "github.com/ipshipyard/p2p-forge/client"
"github.com/libp2p/go-libp2p"
kad "github.com/libp2p/go-libp2p-kad-dht"
"github.com/libp2p/go-libp2p-kad-dht/ants"
pb "github.com/libp2p/go-libp2p-kad-dht/pb"
"github.com/libp2p/go-libp2p/core/connmgr"
"github.com/libp2p/go-libp2p/core/crypto"
"github.com/libp2p/go-libp2p/core/event"
"github.com/libp2p/go-libp2p/core/host"
"github.com/libp2p/go-libp2p/core/network"
"github.com/libp2p/go-libp2p/core/peer"
"github.com/libp2p/go-libp2p/core/peerstore"
"github.com/libp2p/go-libp2p/core/protocol"
@@ -23,6 +25,8 @@ import (
libp2pwebrtc "github.com/libp2p/go-libp2p/p2p/transport/webrtc"
libp2pws "github.com/libp2p/go-libp2p/p2p/transport/websocket"
libp2pwebtransport "github.com/libp2p/go-libp2p/p2p/transport/webtransport"
"github.com/multiformats/go-multiaddr"
mh "github.com/multiformats/go-multihash"
"github.com/probe-lab/go-libdht/kad/key/bit256"
"go.uber.org/zap"
)
@@ -32,13 +36,37 @@ const (
userAgent = "celestiant"
)

type RequestEvent struct {
Timestamp time.Time
Self peer.ID
Remote peer.ID
Type pb.Message_MessageType
Target mh.Multihash
AgentVersion string
Protocols []protocol.ID
Maddrs []multiaddr.Multiaddr
ConnMaddr multiaddr.Multiaddr
}

func (r *RequestEvent) IsIdentified() bool {
return r.AgentVersion != "" && len(r.Protocols) > 0 && len(r.Maddrs) > 0
}

func (r *RequestEvent) MaddrStrings() []string {
maddrStrs := make([]string, len(r.Maddrs))
for i, maddr := range r.Maddrs {
maddrStrs[i] = maddr.String()
}
return maddrStrs
}

type AntConfig struct {
PrivateKey crypto.PrivKey
UserAgent string
Port int
ProtocolPrefix string
BootstrapPeers []peer.AddrInfo
EventsChan chan ants.RequestEvent
RequestsChan chan<- RequestEvent
CertPath string
}

@@ -59,8 +87,8 @@ func (cfg *AntConfig) Validate() error {
return fmt.Errorf("bootstrap peers are not set")
}

if cfg.EventsChan == nil {
return fmt.Errorf("events channel is not set")
if cfg.RequestsChan == nil {
return fmt.Errorf("requests channel is not set")
}

return nil
@@ -154,7 +182,7 @@ func SpawnAnt(ctx context.Context, ps peerstore.Peerstore, ds ds.Batching, cfg *
kad.BootstrapPeers(cfg.BootstrapPeers...),
kad.ProtocolPrefix(protocol.ID(cfg.ProtocolPrefix)),
kad.Datastore(ds),
kad.RequestsLogChan(cfg.EventsChan),
kad.OnRequestHook(onRequestHook(h, cfg)),
}
dht, err := kad.New(ctx, h, dhtOpts...)
if err != nil {
@@ -179,7 +207,10 @@ func SpawnAnt(ctx context.Context, ps peerstore.Peerstore, ds ds.Batching, cfg *
logger.Debug("certificate loaded channel closed")
}()

sub, err := h.EventBus().Subscribe([]interface{}{new(event.EvtLocalAddressesUpdated), new(event.EvtLocalReachabilityChanged)})
sub, err := h.EventBus().Subscribe([]interface{}{
new(event.EvtLocalAddressesUpdated),
new(event.EvtLocalReachabilityChanged),
})
if err != nil {
return nil, fmt.Errorf("subscribe to event bus: %w", err)
}
@@ -205,7 +236,7 @@ func SpawnAnt(ctx context.Context, ps peerstore.Peerstore, ds ds.Batching, cfg *
default:
continue
}
logger.Infof(" [%d] %s %s/p2p/%s", i, actionStr, maddr.Address, h.ID())
logger.Infof("[%d] %s %s/p2p/%s", i, actionStr, maddr.Address, h.ID())
}
case event.EvtLocalReachabilityChanged:
logger.Infow("Reachability changed", "ant", h.ID(), "reachability", evt.Reachability)
@@ -226,6 +257,33 @@ func SpawnAnt(ctx context.Context, ps peerstore.Peerstore, ds ds.Batching, cfg *
return ant, nil
}

func onRequestHook(h host.Host, cfg *AntConfig) func(ctx context.Context, s network.Stream, req pb.Message) {
return func(ctx context.Context, s network.Stream, req pb.Message) {
remotePeer := s.Conn().RemotePeer()

agentVersion := ""
val, err := h.Peerstore().Get(remotePeer, "AgentVersion")
if err == nil {
agentVersion = val.(string)
}

maddrs := h.Peerstore().Addrs(remotePeer)
protocolIDs, _ := h.Peerstore().GetProtocols(remotePeer) // ignore error

cfg.RequestsChan <- RequestEvent{
Timestamp: time.Now(),
Self: h.ID(),
Remote: remotePeer,
Type: req.GetType(),
Target: req.GetKey(),
AgentVersion: agentVersion,
Protocols: protocolIDs,
Maddrs: maddrs,
ConnMaddr: s.Conn().RemoteMultiaddr(),
}
}
}

func (a *Ant) Close() error {
if err := a.sub.Close(); err != nil {
logger.Warnf("failed to close address update subscription: %s", err)
4 changes: 2 additions & 2 deletions cmd/ants/main.go
@@ -176,14 +176,14 @@ func main() {
Value: queenConfig.CertsPath,
},
&cli.IntFlag{
Name: "first_port",
Name: "first.port",
Usage: "First port ants can listen on",
EnvVars: []string{"ANTS_FIRST_PORT"},
Destination: &queenConfig.FirstPort,
Value: queenConfig.FirstPort,
},
&cli.IntFlag{
Name: "num_ports",
Name: "num.ports",
Usage: "Number of ports ants can listen on",
EnvVars: []string{"ANTS_NUM_PORTS"},
Destination: &queenConfig.NumPorts,
8 changes: 4 additions & 4 deletions db/client.go
@@ -8,14 +8,14 @@ import (
"strconv"
"time"

"github.com/probe-lab/ants-watch/metrics"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"

"github.com/ClickHouse/clickhouse-go/v2"
"github.com/ClickHouse/clickhouse-go/v2/lib/driver"
logging "github.com/ipfs/go-log/v2"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"
"golang.org/x/net/proxy"

"github.com/probe-lab/ants-watch/metrics"
)

var logger = logging.Logger("db")
@@ -0,0 +1,2 @@
ALTER TABLE requests
DROP COLUMN conn_maddr;
@@ -0,0 +1,2 @@
ALTER TABLE requests
ADD COLUMN conn_maddr String;
1 change: 1 addition & 0 deletions db/models.go
@@ -21,4 +21,5 @@ type Request struct {
StartedAt time.Time `ch:"started_at"`
KeyID string `ch:"key_multihash"`
MultiAddresses []string `ch:"multi_addresses"`
ConnMaddr string `ch:"conn_maddr"`
}
1 change: 0 additions & 1 deletion go-libp2p-kad-dht
Submodule go-libp2p-kad-dht deleted from 8d2a9d