Other pings can appear in the results #52

Open
mpiraux opened this issue Feb 17, 2023 · 4 comments

Comments

@mpiraux

mpiraux commented Feb 17, 2023

Hello,
I've been running caracal on a machine that also had a RIPE Atlas software probe running, and found the RIPE pings in the output CSV. I guess the tool simply logs whatever is exchanged on the captured interface.

@maxmouchet
Member

Hi,

It does capture all incoming ICMP. However, the integrity check feature should drop ICMP replies that do not match probes sent by caracal.

Can you share the command line you've been using to run caracal, as well as the kind of measurements (protocol, IPv4 or IPv6)?

@mpiraux
Author

mpiraux commented Feb 21, 2023

Here is a CSV from one of these measurements, google.com.csv. It's a series of pings towards google.com, resolved in IPv4 and IPv6. In the log, you can see destination IP 2001:67c:2e8:3::c100:a4 appear, which is not the Google server (2a00:1450:400e:803::200) but a RIPE anchor that the RIPE probe software running on the machine was pinging. There are 20 or so occurrences of this in the file.

I'm using pycaracal with a script very similar to the examples. Since the PRNG is seeded, I can regenerate the probe specifications I fed to caracal.

#!/usr/bin/env python3

import random
import socket
import sys

from pycaracal import Probe, prober

if len(sys.argv) < 3:
    print(f"Usage: {sys.argv[0]} hostname csv_output")
    # Exit here instead of printing -1 and carrying on without arguments
    sys.exit(1)

PORT_LO = 50000
PORT_HI = 60000
RAYS = 32
PACKET_PER_PROBE = 4
PROBE_PER_RAY = 50 // PACKET_PER_PROBE
TTL = 127
random.seed("caracal_rays.py")

hostname = sys.argv[1]
try:
    v4_addr = socket.getaddrinfo(hostname, None, family=socket.AF_INET, proto=socket.SOCK_RAW)[0]
    v6_addr = socket.getaddrinfo(hostname, None, family=socket.AF_INET6, proto=socket.SOCK_RAW)[0]
except (IndexError, socket.gaierror):
    # getaddrinfo raises socket.gaierror when a family does not resolve
    print(f"Domain {hostname} does not resolve to both families")
    sys.exit(1)

srcports = [random.randrange(PORT_LO, PORT_HI) for _ in range(RAYS)]

probes = [Probe(v4_addr[4][0], srcport, PORT_HI, TTL, "icmp") for _ in range(PROBE_PER_RAY) for srcport in srcports] + \
         [Probe(v6_addr[4][0], srcport, PORT_HI, TTL, "icmp6") for _ in range(PROBE_PER_RAY) for srcport in srcports]

random.shuffle(probes)

config = prober.Config()
config.set_n_packets(PACKET_PER_PROBE)
config.set_output_file_csv(sys.argv[2])
config.set_sniffer_wait_time(10)
config.set_probing_rate(4)
config.set_batch_size(1)
print(prober.probe(config, probes))

@maxmouchet
Member

maxmouchet commented Feb 21, 2023

Ok thanks, I see! The issue is that we do not validate echo replies and IPv6 replies.

The way validation (or integrity checking) currently works is:

  1. Generate a random identifier (caracal_id) on start (can also be specified with --caracal-id)
  2. For each probe, compute the caracal checksum as the IP checksum of (caracal_id, ipv4_last_byte, flow_id, ttl)
  3. Encode this checksum in the IPv4 ID field and send the probe
  4. When we get a time exceeded reply, extract the checksum and ipv4_last_byte, flow_id and ttl from the quoted IP packet, and compare it with the expected checksum.
  5. If the checksum doesn't match, discard the reply.

This doesn't work for ICMP Echo Replies. Since the original probe packet is not quoted, we cannot retrieve the checksum field (IP ID). The same issue applies to IPv6, where there is no ID field in the IP header.
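
To make the steps above concrete, here is a minimal Python sketch of the scheme. It is not caracal's implementation: the byte layout passed to the checksum, and the names `probe_checksum` and `is_valid`, are assumptions made for illustration. Only the overall idea (an Internet checksum over the identifier and probe fields, carried in the IP ID and compared on quoted replies) comes from the description above.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def probe_checksum(caracal_id: int, dst_addr: str, flow_id: int, ttl: int) -> int:
    # Assumed layout (illustrative only): 32-bit id, 32-bit IPv4 destination,
    # 16-bit flow id, 8-bit TTL.
    dst = int(ipaddress.IPv4Address(dst_addr))
    return internet_checksum(struct.pack("!IIHB", caracal_id, dst, flow_id, ttl))


def is_valid(caracal_id, icmp_type, probe_id, probe_dst, probe_flow_id, probe_ttl):
    # Only ICMP types 3 (destination unreachable) and 11 (time exceeded) quote
    # the probe, so only those can be checked. Everything else is accepted,
    # which is how unrelated echo replies end up in the output.
    if icmp_type in (3, 11):
        return probe_id == probe_checksum(caracal_id, probe_dst, probe_flow_id, probe_ttl)
    return True
```

With this layout, a time exceeded reply whose quoted IP ID was not produced from the same `caracal_id` is rejected, while an echo reply (type 0) always passes because there is nothing to compare against.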

This wasn't really an issue for us since we only cared about routers, and not replies from the destination.

I haven't given it more thought, but maybe the checksum can be encoded in the ICMP ID field, which we should be able to retrieve from the Echo Reply. Since caracal embeds all its state in the probe packet, we're already abusing every header field and there's not much room left for additional information.
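
A rough sketch of what the ICMP ID idea could look like, in Python for brevity. Everything here is hypothetical: `echo_tag`, `build_echo_request` and `is_our_echo_reply` are made-up names, the tag is derived only from the identifier and the sequence number, and the sketch ignores how caracal already uses the ICMP header fields. It only relies on the fact that an Echo Reply carries back the identifier and sequence number of the request.

```python
import struct

ICMP_ECHO_REQUEST = 8
ICMP_ECHO_REPLY = 0


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def echo_tag(caracal_id: int, seq: int) -> int:
    # Hypothetical 16-bit tag derived from the run identifier and sequence number.
    return internet_checksum(struct.pack("!IH", caracal_id, seq))


def build_echo_request(caracal_id: int, seq: int, payload: bytes = b"") -> bytes:
    ident = echo_tag(caracal_id, seq)
    # First pack with a zero checksum, then fill in the real ICMP checksum.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, csum, ident, seq) + payload


def is_our_echo_reply(caracal_id: int, icmp_packet: bytes) -> bool:
    if len(icmp_packet) < 8:
        return False
    icmp_type, _code, _csum, ident, seq = struct.unpack("!BBHHH", icmp_packet[:8])
    return icmp_type == ICMP_ECHO_REPLY and ident == echo_tag(caracal_id, seq)
```

A 16-bit tag still lets roughly one in 65536 foreign echo replies through by chance, and it does not cover the destination address, flow ID or TTL, so it would be weaker than the existing check on quoted packets.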

A workaround is (obviously :-)) to probe from a machine with no other ping processes.
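
If probing from a clean machine is not an option, the output can also be filtered afterwards by keeping only rows whose probe destination is one of the addresses that were actually probed. A minimal sketch, assuming the CSV has a header row and a destination column named `probe_dst_addr` (check the real header of your file; the column name is an assumption here), and tolerating IPv4 addresses written in IPv4-mapped IPv6 form. The sample data is made up, with 192.0.2.1 as a placeholder IPv4 target.

```python
import csv
import io
import ipaddress


def normalize(addr: str) -> str:
    """Return a canonical string form, unwrapping IPv4-mapped IPv6 addresses."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return str(ip)


def filter_rows(rows, targets, column="probe_dst_addr"):
    """Keep only rows whose destination column is one of the probed targets."""
    wanted = {normalize(t) for t in targets}
    return [row for row in rows if normalize(row[column]) in wanted]


# Made-up sample: two rows towards probed targets, one towards a RIPE anchor.
sample = io.StringIO(
    "probe_dst_addr,rtt\n"
    "2a00:1450:400e:803::200,120\n"
    "2001:67c:2e8:3::c100:a4,85\n"
    "::ffff:192.0.2.1,130\n"
)
kept = filter_rows(csv.DictReader(sample), ["2a00:1450:400e:803::200", "192.0.2.1"])
print(len(kept))
```

This does not help when the other ping process targets the same destination as the measurement; in that case the replies cannot be told apart by address alone.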

Probe checksum in https://github.com/dioptra-io/caracal/blob/main/src/probe.cpp:

uint16_t Probe::checksum(uint32_t caracal_id) const noexcept {
  // TODO: IPv6 support? Or just encode the last 32 bits for IPv6?
  return Checksum::caracal_checksum(caracal_id, dst_addr.s6_addr32[3], src_port,
                                    ttl);
}

Reply checksum in https://github.com/dioptra-io/caracal/blob/main/src/reply.cpp:

uint16_t Reply::checksum(uint32_t caracal_id) const {
  // TODO: IPv6 support? Or just encode the last 32 bits for IPv6?
  return Checksum::caracal_checksum(caracal_id, probe_dst_addr.s6_addr32[3],
                                    probe_src_port, probe_ttl);
}

bool Reply::is_valid(uint32_t caracal_id) const {
  // Currently, we only validate IPv4 ICMP time exceeded and destination
  // unreachable messages. We cannot validate echo replies as they do not
  // contain the probe_id field contained in the source IP header.
  // TODO: IPv6 support?
  if (reply_protocol == IPPROTO_ICMP &&
      (reply_icmp_type == 3 || reply_icmp_type == 11)) {
    return probe_id == checksum(caracal_id);
  }
  return true;
}

@SaiedKazemi
Member

@mpiraux @maxmouchet Are there any updates to this issue since February? Can we close it?
