This repository has been archived by the owner on Feb 13, 2024. It is now read-only.

Can you share the ~600GB graph database in Torrent? #10

Open
brunoaduarte opened this issue Dec 27, 2018 · 21 comments

Comments

@brunoaduarte
Contributor

> The resulting Neo4j database is roughly 6x the size of the blockchain. So if the blockchain is 100GB, your Neo4j database will be 600GB.
> It may take 60+ days to finish importing the entire blockchain.

Can you share the ~600GB graph database in Torrent?

@noff

noff commented Jan 3, 2019

Up

@in3rsha
Owner

in3rsha commented Jan 6, 2019

I tried setting up a torrent before but couldn't get it working for some reason, and I haven't tried again since. What would be the use? I would have to dedicate some time to setting up and hosting the torrent (and figuring out why it didn't work last time, ha).

I have the database running on a server that is accessible through the web browser, if that's any use.

@noff

noff commented Jan 7, 2019

I'm trying to build a search for fraudulent transactions.
I think I can host the torrent if you can share the actual database, because importing the blockchain on my own server would take a few months.
Can you make a backup and share it somewhere? For example, I can give you SSH access to a test server so you will be able to upload it.

@in3rsha
Owner

in3rsha commented Jan 7, 2019

Okay, I see. The database is about 1TB, so I think a torrent would be the best way to share it. If I can find some free time I'll look into setting up a torrent.

@noff

noff commented Jan 14, 2019

Maybe I can help you configure the torrent to save you some time?

@brunoaduarte
Contributor Author

brunoaduarte commented Feb 6, 2019

I did some testing here on the "data/databases/graph.db/neostore.transaction.db.X" files. Using RAR I got a compression ratio of about 15%, so the 1TB of data compressed to RAR should end up at around 150~200 GB.

Creating a new torrent is very easy: just download and install uTorrent Classic, press CTRL+N to open the new-torrent dialog, select the compressed database RAR file and click Create. It will immediately start seeding. Then you just copy the magnet URI and paste it here :)

For example, this is the one I've just created:


magnet:?xt=urn:btih:CECCD44A424A6F541373C38D90300DAD68A16A4E&dn=graph.db.rar&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a80%2fannounce
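
If the database sits on a headless server, the same thing can be done from the command line. A rough sketch, assuming `rar` and `mktorrent` are installed (the tracker URLs are the same ones used in the magnet link above):

```
# compress the store (stop Neo4j first so the files are in a consistent state)
rar a -m3 graph.db.rar /path/to/data/databases/graph.db

# build the .torrent, announcing to the same public trackers
mktorrent \
  -a udp://tracker.opentrackr.org:1337/announce \
  -a udp://tracker.openbittorrent.com:80/announce \
  -o graph.db.rar.torrent \
  graph.db.rar
```

Then seed graph.db.rar with any client and post the magnet link.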

In the meantime I'm running "bitcoin-to-neo4j" on my blocks folder. After 12 hours I'm at height 108,651 / 561,747. Going by your 60-day estimate, I guess the importing speed will slow down significantly over time, right?

Thanks

@noff

noff commented Feb 6, 2019

Yes, I've been importing for 2 months and I'm now at height 252,000.

@brunoaduarte
Contributor Author

> Yes, I've been importing for 2 months and I'm now at height 252,000.

What size is it already?

@noff

noff commented Feb 6, 2019

Only 66 GB.

@brunoaduarte
Contributor Author

> Only 66 GB.

@noff, could you run this command, which shows how long each of the .dat files took to be processed, and paste the result here, please?

redis-cli hgetall bitcoin-to-neo4j:log

@noff

noff commented Feb 7, 2019

  2) "[79836] 974.81 mins"
  3) "blk00001.dat"
  4) "[11249] 594.20 mins"
  5) "blk00002.dat"
  6) "[5706] 583.93 mins"
  7) "blk00003.dat"
  8) "[5712] 612.38 mins"
  9) "blk00004.dat"
 10) "[6399] 659.58 mins"
 11) "blk00005.dat"
 12) "[7457] 653.84 mins"
 13) "blk00006.dat"
 14) "[7236] 615.13 mins"
 15) "blk00007.dat"
 16) "[6210] 598.06 mins"
 17) "blk00008.dat"
 18) "[6145] 616.54 mins"
 19) "blk00009.dat"
 20) "[3954] 634.31 mins"
 21) "blk00010.dat"
 22) "[1513] 658.11 mins"
 23) "blk00011.dat"
 24) "[1544] 520.79 mins"
 25) "blk00012.dat"
 26) "[1377] 460.50 mins"
 27) "blk00013.dat"
 28) "[1079] 588.11 mins"
 29) "blk00014.dat"
 30) "[1797] 650.83 mins"
 31) "blk00015.dat"
 32) "[1856] 648.95 mins"
 33) "blk00016.dat"
 34) "[1393] 636.46 mins"
 35) "blk00017.dat"
 36) "[1547] 666.84 mins"
 37) "blk00018.dat"
 38) "[1534] 724.78 mins"
 39) "blk00019.dat"
 40) "[1188] 685.00 mins"
 41) "blk00020.dat"
 42) "[1530] 726.96 mins"
 43) "blk00021.dat"
 44) "[1333] 668.53 mins"
 45) "blk00022.dat"
 46) "[1510] 644.35 mins"
 47) "blk00023.dat"
 48) "[1600] 513.11 mins"
 49) "blk00024.dat"
 50) "[1389] 494.09 mins"
 51) "blk00025.dat"
 52) "[1341] 641.53 mins"
 53) "blk00026.dat"
 54) "[1281] 570.26 mins"
 55) "blk00027.dat"
 56) "[1767] 548.89 mins"
 57) "blk00028.dat"
 58) "[1439] 607.54 mins"
 59) "blk00029.dat"
 60) "[1193] 612.29 mins"
 61) "blk00030.dat"
 62) "[1369] 614.33 mins"
 63) "blk00031.dat"
 64) "[1177] 595.22 mins"
 65) "blk00032.dat"
 66) "[923] 517.29 mins"
 67) "blk00033.dat"
 68) "[465] 304.85 mins"
 69) "blk00034.dat"
 70) "[1187] 607.02 mins"
 71) "blk00035.dat"
 72) "[1064] 616.63 mins"
 73) "blk00036.dat"
 74) "[820] 616.93 mins"
 75) "blk00037.dat"
 76) "[829] 558.56 mins"
 77) "blk00038.dat"
 78) "[848] 549.05 mins"
 79) "blk00039.dat"
 80) "[890] 516.38 mins"
 81) "blk00040.dat"
 82) "[873] 628.13 mins"
 83) "blk00041.dat"
 84) "[796] 634.02 mins"
 85) "blk00042.dat"
 86) "[954] 661.90 mins"
 87) "blk00043.dat"
 88) "[857] 562.21 mins"
 89) "blk00044.dat"
 90) "[829] 535.56 mins"
 91) "blk00045.dat"
 92) "[762] 530.25 mins"
 93) "blk00046.dat"
 94) "[753] 527.93 mins"
 95) "blk00047.dat"
 96) "[786] 540.58 mins"
 97) "blk00048.dat"
 98) "[1197] 533.25 mins"
 99) "blk00049.dat"
100) "[960] 474.44 mins"
101) "blk00050.dat"
102) "[739] 457.02 mins"
103) "blk00051.dat"
104) "[796] 481.15 mins"
105) "blk00052.dat"
106) "[717] 499.94 mins"
107) "blk00053.dat"
108) "[746] 562.37 mins"
109) "blk00054.dat"
110) "[809] 576.79 mins"
111) "blk00055.dat"
112) "[844] 583.04 mins"
113) "blk00056.dat"
114) "[814] 532.44 mins"
115) "blk00057.dat"
116) "[777] 509.30 mins"
117) "blk00058.dat"
118) "[838] 504.36 mins"
119) "blk00059.dat"
120) "[726] 515.63 mins"
121) "blk00060.dat"
122) "[684] 508.69 mins"
123) "blk00061.dat"
124) "[815] 520.18 mins"
125) "blk00062.dat"
126) "[878] 509.49 mins"
127) "blk00063.dat"
128) "[922] 513.50 mins"
129) "blk00064.dat"
130) "[985] 510.51 mins"
131) "blk00065.dat"
132) "[1095] 562.51 mins"
133) "blk00066.dat"
134) "[1058] 545.18 mins"
135) "blk00067.dat"
136) "[1055] 594.48 mins"
137) "blk00068.dat"
138) "[740] 426.48 mins"
139) "blk00069.dat"
140) "[520] 447.56 mins"
141) "blk00070.dat"
142) "[1170] 909.89 mins"
143) "blk00071.dat"
144) "[1271] 901.13 mins"
145) "blk00072.dat"
146) "[1195] 892.13 mins"
147) "blk00073.dat"
148) "[1094] 906.22 mins"
149) "blk00074.dat"
150) "[1160] 936.58 mins"
151) "blk00075.dat"
152) "[890] 1,011.22 mins"
153) "blk00076.dat"
154) "[918] 1,355.98 mins"
155) "blk00077.dat"
156) "[888] 1,182.47 mins"
157) "blk00078.dat"
158) "[1135] 1,253.56 mins"
159) "blk00079.dat"
160) "[968] 1,759.61 mins"
161) "blk00080.dat"
162) "[1166] 1,879.71 mins"```

@brunoaduarte
Contributor Author

You said you've been running this for 2 months now and it has imported 80 of the 1518 blk files.
If it keeps this average importing speed (which it probably won't), it will take what, ~4 years to finish?
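
For what it's worth, the logged minutes themselves give a rough lower bound. A quick sketch (it assumes the per-file average so far holds, which it won't, since later blk files carry far more transactions):

```
# sum the "X.XX mins" values from the log and extrapolate to all 1518 blk files
redis-cli hgetall bitcoin-to-neo4j:log \
  | grep -oE '[0-9,]+\.[0-9]+ mins' \
  | tr -d ',' \
  | awk '{sum += $1; n++}
         END {printf "avg %.0f mins/file -> roughly %.0f days for 1518 files\n",
                     sum/n, sum/n * 1518 / 60 / 24}'
```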

@noff

noff commented Feb 7, 2019

Looks like it.
We are now digging in the fast-import direction using CSV.

@brunoaduarte
Contributor Author

brunoaduarte commented Feb 7, 2019

@noff 's log

  1. "blk00000.dat" "[79836] 974.81 mins"
  2. "blk00001.dat" "[11249] 594.20 mins"
  3. "blk00002.dat" "[5706] 583.93 mins"
  4. "blk00003.dat" "[5712] 612.38 mins"
  5. "blk00004.dat" "[6399] 659.58 mins"
  6. "blk00005.dat" "[7457] 653.84 mins"
  7. "blk00006.dat" "[7236] 615.13 mins"

Here's my log. The HDD is really slow, but with an SSD things speed up a lot...
(HDD)

  1. "blk00000.dat" "[119965] 757.15 mins"
  2. "blk00001.dat" "[11259] 480.65 mins"
  3. "blk00002.dat" "[1473] ------- mins" (importing restarted)
  4. "blk00003.dat" "[5726] 540.56 mins"
  5. "blk00004.dat" "[6392] 606.81 mins"
  6. "blk00005.dat" "[7479] 595.06 mins"
  7. "blk00006.dat" "[7214] 573.46 mins"

(SSD)

  1. "blk00000.dat" "[119965] 445.19 mins"
  2. "blk00001.dat" "[11259] 302.57 mins"
  3. "blk00002.dat" "[5697] 293.72 mins"

@noff

noff commented Feb 8, 2019

We are now working on a fast import of the initial data via CSV. It could be a solution.
I've found a Go script which imports the whole blockchain into PostgreSQL in 24 hours. We will use this approach.
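
For the Neo4j side, the idea would be to generate node and relationship CSVs first and then load them with the offline bulk importer instead of running live Cypher transactions. A minimal sketch with Neo4j 3.x (the blocks.csv / transactions.csv / relationships.csv names are just placeholders; labels and relationship types would come from :LABEL / :TYPE columns in the CSV headers):

```
# stop Neo4j first: neo4j-admin import builds a brand-new store offline (Neo4j 3.x syntax)
neo4j-admin import --mode=csv --database=graph.db \
  --nodes=blocks.csv \
  --nodes=transactions.csv \
  --relationships=relationships.csv
```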

@brunoaduarte
Contributor Author

brunoaduarte commented Feb 8, 2019

> We are now working on a fast import of the initial data via CSV. It could be a solution.
> I've found a Go script which imports the whole blockchain into PostgreSQL in 24 hours. We will use this approach.

Great! Please let us know how this develops. Thanks!

@jackenbaer

Torrent or fast CSV import would be great. I've got the same problem...

@jackenbaer

As an alternative I found a project that seems to be faster for the initial import:
https://github.com/straumat/blockchain2graph
After 4 days of parsing (using an SSD) I am at block height 328,000 (~310 GB).
I recommend using the Dockerfile.
@in3rsha Sorry for advertising other projects in the comment section of your project --> I bought you a beer (3Beer3irc1vgs76ENA4coqsEQpGZeM5CTd)

@arisjr

arisjr commented Aug 26, 2019

@Nehberg, using the blockchain2graph project led me to a >3TB database within weeks. Its schema produces a much bigger database from the same input. It grows faster, among other things, because of its database schema, I think. I don't have much knowledge of Neo4j, but the property strings store alone was more than half of the database size.

The bitcoin-to-neo4j schema is far smaller and, as you can see in this project's browser, faster and more efficient.

I would stick with Greg's project and Neo4j schema, but use the CSV approach. There are implementations you can google, based on Greg's work, that managed to import the blockchain in one day.

@daniel10012

Hello guys,
Has someone figured out a way to import the blockchain into CSVs (either by parsing it directly or via JSON-RPC) in a format that respects Greg's schema?
I can't figure it out, thanks!
Greg, absolutely amazing work and a great learnmeabitcoin website, very informative, thank you.
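
For the RPC route, this is roughly what I had in mind: walk the chain with bitcoin-cli and write one CSV row per block (a sketch only; it's slow, one call per block, and it only covers the block nodes, not the tx/output relationships the schema needs):

```
# assumes bitcoind is running and jq is installed
for h in $(seq 0 560000); do
  hash=$(bitcoin-cli getblockhash "$h")
  bitcoin-cli getblock "$hash" \
    | jq -r '[.hash, .height, .time, .previousblockhash // ""] | @csv'
done > blocks.csv
```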

Repository owner deleted a comment from xingyushu Oct 19, 2020
@bbm-design

bbm-design commented Dec 30, 2020

Can someone seed for a moment, please?
I'll keep the seed open afterwards.

Could this project be used for importing the history through CSV? https://github.com/behas/bitcoingraph (sorry for advertising other projects)
