How to set up capture times
In the ITHI project, the metrics M4, M6 and M8 aim at understanding DNS usage. They are computed by aggregating captures from multiple sources. Each of these sources is a network operator managing a DNS recursive resolver. If you have volunteered to be one of these sources, thank you! Hopefully, the following rules will help you set up the capture in a useful way:
- Set up a capture script to run once a day on each selected instance.
- Pick a different time each day to run the script.
- Each of these capture sessions should try to capture one million transactions.
These rules are designed to minimize statistical bias. We are concerned with three kinds of bias:
- Internet usage is probably different in different locales. For example, people in Germany are significantly more likely to access names in the ".de" domain than people in India.
- Internet usage is different when people are working and when they are just enjoying the Internet. For example, we are more likely to see erroneous requests for the ".corp" string from people at work than from people at home.
- Internet usage may vary based on the day of the week. For example, people are more likely to watch movies or ballgames during weekends than during weekdays.
We are addressing the locality bias by recruiting volunteers in many countries and regions. But we rely on each of these volunteers to capture, as much as possible, the variations between different hours of the day (working or leisure) and different days of the week (workdays and weekends).
When you are writing the capture script, give the output files names that identify the organization, the server instance and the date of the capture, such as "example-nw17-2018-04-01.csv". The syntax of capture scripts is defined in another page on this wiki.
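The naming convention above can be sketched as a small shell helper. This is only an illustration: the function name capture_file_name and its argument order are assumptions, not part of the ITHI tools, and the date defaults to the current day when it is not given.

```shell
#!/bin/sh
# Hypothetical helper (not part of the ITHI tools): build a capture file
# name of the form <organization>-<instance>-<YYYY-MM-DD>.csv.
capture_file_name() {
    org="$1"                          # organization, e.g. "example"
    instance="$2"                     # server instance, e.g. "nw17"
    day="${3:-$(date +%Y-%m-%d)}"     # capture date, defaults to today
    printf '%s-%s-%s.csv\n' "$org" "$instance" "$day"
}

# Prints: example-nw17-2018-04-01.csv
capture_file_name example nw17 2018-04-01
```

A capture script could call such a helper once at start-up and pass the result to whatever option the capture tool uses for its output file.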
Capture scripts are typically launched using cron on the capture server, with a crontab entry for each day of the week, for example:
# Run at 9:05 am on Monday
5 9 * * 1 local/capture-script.sh
# Run at 2:05 pm on Tuesday
5 14 * * 2 local/capture-script.sh
# Run at 7:05 pm on Wednesday
5 19 * * 3 local/capture-script.sh
# Run at 12:05 am on Thursday
5 0 * * 4 local/capture-script.sh
# Run at 5:05 am on Friday
5 5 * * 5 local/capture-script.sh
# Run at 10:05 am on Saturday
5 10 * * 6 local/capture-script.sh
# Run at 3:05 pm on Sunday
5 15 * * 7 local/capture-script.sh
The string local/capture-script.sh in this example should of course be replaced by the actual location of the script on the server.
When an organization performs captures on several server instances, it should, as much as possible, use different capture times on different servers.
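One possible way to stagger capture times across servers is to shift the weekly schedule by a fixed number of hours per instance. The sketch below prints a crontab for one instance. The function name stagger_crontab, the instance index and the 3-hour step are all assumptions made for illustration. Index 0 reproduces the example schedule shown earlier.

```shell
#!/bin/sh
# Hypothetical sketch: print a weekly crontab for one server instance,
# shifting the example schedule by 3 hours per instance index.
# $1: instance index (0, 1, 2, ...), an assumed numbering of the servers
# $2: location of the capture script on that server
stagger_crontab() {
    index="$1"
    script="$2"
    dow=1                             # 1 = Monday ... 7 = Sunday
    for base_hour in 9 14 19 0 5 10 15; do
        hour=$(( (base_hour + 3 * index) % 24 ))
        printf '5 %d * * %d %s\n' "$hour" "$dow" "$script"
        dow=$((dow + 1))
    done
}

# Schedule for the second instance (index 1)
stagger_crontab 1 local/capture-script.sh
```

With a 3-hour step the schedule repeats after eight instances, so an organization with more servers would need a smaller step or a different scheme.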