Commit

Merge pull request #111 from simongog/tutorial
Tutorial
simongog committed Sep 23, 2013
2 parents 0b23e5c + dc1fb5a commit 49f5565
Showing 14 changed files with 76 additions and 21 deletions.
16 changes: 10 additions & 6 deletions README.md
@@ -6,8 +6,7 @@ What is it?

The Succinct Data Structure Library (SDSL) is a powerful and flexible C++11
library implementing succinct data structures. In total, the library contains
-the highlights of 40 [research publications](https://github.com/simongog/sdsl-lite/wiki/Literature).
-Succinct data structures
+the highlights of 40 [research publications][SDSLLIT] Succinct data structures
can represent an object (such as a bitvector or a tree) in space close the
information-theoretic lower bound of the object while supporting operations
of the original object efficiently. The theoretical time complexity of an
@@ -51,7 +50,7 @@ data structure to their full potential.
features provided by the library.
* All data structures are tested for correctness using a unit-testing framework.
* We provide a large collection of supporting documentation consisting of examples,
-[cheat sheet][SDSLCS], tutorial slides and walk-through.
+[cheat sheet][SDSLCS], [tutorial slides and walk-through][TUT].

The library contains many succinct data structures from the following categories:

@@ -145,7 +144,7 @@ To compile the program using `g++` run:
g++ -std=c++11 -O3 -I ~/include -L ~/lib program.cpp -o lsdsl
```

-Next we suggest you look at the comprehensive [tutorial][TUT] of Simon Gog which describes
+Next we suggest you look at the comprehensive [tutorial][TUT] which describes
all major features of the library or look at some of the provided [examples](examples).

Test
@@ -203,7 +202,9 @@ more information see the COPYING file in the library directory.
Lots of time was spent implementing the many features of the library. If you
use the library in an academic setting please cite the following paper:

-_Simon Gog, Matthias Petri: Optimized Succinct Data Structures for Massive Data, Accepted for publication in Software, Practice and Experience_.
+Simon Gog, Matthias Petri:
+[Optimized Succinct Data Structures for Massive Data][SPE],
+Accepted for publication in Software, Practice and Experience.

## External Resources used in SDSL

@@ -212,7 +213,7 @@ construction algorithms.

* Yuta Mori's incredible fast suffix [libdivsufsort][DIVSUF]
algorithm (version 2.0.1) for byte-alphabets.
-* An adapted version of Jesper Larsson's implementation of the
+* An adapted version of Jesper Larsson's [implementation][QSUFIMPL] of the
algorithm of [Larson and Sadakane][LS] for integer-alphabets.

Additionally, we use the [googletest][GTEST] framework to provide unit tests.
@@ -252,3 +253,6 @@ Feel free to contact any of the authors or create an issue on the
[LS]: http://www.sciencedirect.com/science/article/pii/S0304397507005257 "Larson & Sadakane Algorithm"
[GTEST]: https://code.google.com/p/googletest/ "Google C++ Testing Framework"
[SDSLCS]: http://simongog.github.io/assets/data/sdsl-cheatsheet.pdf "SDSL Cheat Sheet"
+[SDSLLIT]: https://github.com/simongog/sdsl-lite/wiki/Literature "Succinct Data Structure Literature"
+[TUT]: http://simongog.github.io/assets/data/sdsl-slides/tutorial "Tutorial"
+[QSUFIMPL]: http://www.larsson.dogma.net/qsufsort.c "Original Qsufsort Implementation"
2 changes: 1 addition & 1 deletion include/sdsl/enc_vector.hpp
@@ -56,7 +56,7 @@ struct enc_vector_trait<64> {
* @ingroup int_vector
*/
template<class t_coder=coder::elias_delta,
-uint32_t t_dens = 8, uint8_t t_width=0>
+uint32_t t_dens = 128, uint8_t t_width=0>
class enc_vector
{
private:
2 changes: 1 addition & 1 deletion include/sdsl/rrr_vector.hpp
@@ -69,7 +69,7 @@ class select_support_rrr; // in rrr_vector
* In this version the block size can be adjust by the template parameter t_bs!
* \sa sdsl::rrr_vector for a specialized version for block_size=15
*/
-template<uint16_t t_bs=15, class t_rac=int_vector<>, uint16_t t_k=32>
+template<uint16_t t_bs=63, class t_rac=int_vector<>, uint16_t t_k=32>
class rrr_vector
{
static_assert(t_bs >= 3 and t_bs <= 256 , "rrr_vector: block size t_bs must be 3 <= t_bs <= 256.");
6 changes: 3 additions & 3 deletions include/sdsl/vlc_vector.hpp
@@ -43,11 +43,11 @@ struct vlc_vector_trait<32> {
/*! The values of a vlc_vector are immutable after the constructor call. The class
* could be parametrized with a self-delimiting code t_coder and the sample density.
* \tparam t_coder Type of self-delimiting coder.
-* \tparam t_dens Sampling density of stored absolute values.
-* \tparam t_width Width of the underlying int_vector for the absolute samples.
+* \tparam t_dens Sampling density of pointers into the stream of self-delimiting coded numbers.
+* \tparam t_width Width of the underlying int_vector for the pointers.
*/
template<class t_coder = coder::elias_delta,
-uint32_t t_dens = 16,
+uint32_t t_dens = 128,
uint8_t t_width = 0>
class vlc_vector
{
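The revised `t_dens` comment describes sampled pointers into a stream of self-delimiting codes. The following is a minimal sketch of that idea in plain C++, not SDSL's implementation: the class name `sampled_gamma_vector` is hypothetical, it uses Elias-gamma rather than the default `coder::elias_delta`, and it stores bits in a `std::vector<bool>` for simplicity. It assumes every value is smaller than the maximum `uint64_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration type; not part of SDSL.
// Values are Elias-gamma coded into one bit stream, and the bit offset of
// every dens-th value is stored so that access can start decoding there.
class sampled_gamma_vector
{
    public:
        sampled_gamma_vector(const std::vector<std::uint64_t>& values, std::size_t dens)
            : m_dens(dens), m_size(values.size())
        {
            assert(dens > 0);
            for (std::size_t i = 0; i < values.size(); ++i) {
                if (i % m_dens == 0)
                    m_pointers.push_back(m_bits.size()); // sampled pointer
                encode(values[i] + 1); // shift by one so that 0 is encodable
            }
        }

        // Decodes at most dens codes, starting at the nearest sampled pointer.
        std::uint64_t operator[](std::size_t i) const
        {
            assert(i < m_size);
            std::size_t pos = m_pointers[i / m_dens];
            std::uint64_t x = 0;
            for (std::size_t k = 0; k <= i % m_dens; ++k)
                x = decode(pos);
            return x - 1;
        }

        std::size_t size() const { return m_size; }
        std::size_t pointer_count() const { return m_pointers.size(); }

    private:
        // Elias-gamma: floor(log2 x) zeros, then x in binary (MSB first).
        void encode(std::uint64_t x)
        {
            int len = 0;
            while ((x >> len) > 1) ++len;
            for (int k = 0; k < len; ++k) m_bits.push_back(false);
            for (int k = len; k >= 0; --k) m_bits.push_back(((x >> k) & 1) != 0);
        }

        // Reads one gamma code starting at pos and advances pos past it.
        std::uint64_t decode(std::size_t& pos) const
        {
            int len = 0;
            while (!m_bits[pos]) { ++len; ++pos; }
            std::uint64_t x = 0;
            for (int k = 0; k <= len; ++k)
                x = (x << 1) | static_cast<std::uint64_t>(m_bits[pos++]);
            return x;
        }

        std::size_t              m_dens;
        std::size_t              m_size;
        std::vector<bool>        m_bits;
        std::vector<std::size_t> m_pointers;
};
```

In this sketch a larger density means fewer stored pointers but more codes decoded per access, which is the trade-off the change from 16 to 128 moves along.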
6 changes: 3 additions & 3 deletions tutorial/expl-02.cpp
@@ -6,10 +6,10 @@ using namespace sdsl;

int main()
{
-int_vector<> v(10000000);
+int_vector<> v(10*(1<<20));
for (size_t i=0; i<10; ++i)
-for (size_t j=0; j<1000000; ++j)
-v[i*1000000+j] = j;
+for (size_t j=0; j < 1U<<20; ++j)
+v[i*(1<<20)+j] = j;
cout << size_in_mega_bytes(v) << endl;
util::bit_compress(v);
cout << size_in_mega_bytes(v) << endl;
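The arithmetic behind the two printed sizes can be sketched without the library. The helpers below (`bits_needed`, `compressed_width`, `payload_mib`) are hypothetical names, and the sketch assumes that `util::bit_compress` shrinks the element width to the number of bits required by the largest value, and that a default `int_vector<>` starts at 64 bits per element. It counts payload bits only, so the figures reported by `size_in_mega_bytes` may differ slightly because of header fields.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Smallest number of bits that can represent x (at least 1).
inline unsigned bits_needed(std::uint64_t x)
{
    unsigned w = 1;
    while (x >>= 1) ++w;
    return w;
}

// Width a vector would need after compression: bits of its largest element.
inline unsigned compressed_width(const std::vector<std::uint64_t>& v)
{
    std::uint64_t max_value = 0;
    for (std::uint64_t x : v) max_value = std::max(max_value, x);
    return bits_needed(max_value);
}

// Payload size in MiB of n integers stored with `width` bits each.
inline double payload_mib(std::uint64_t n, unsigned width)
{
    return static_cast<double>(n) * width / 8.0 / (1 << 20);
}
```

With `10*(1<<20)` elements whose largest value is `(1<<20)-1`, this gives 80 MiB at 64 bits and 25 MiB at 20 bits. A single element such as `1ULL<<63` forces the width back to 64 bits, so no space is saved in that case.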
2 changes: 1 addition & 1 deletion tutorial/expl-03.cpp
@@ -6,7 +6,7 @@ using namespace sdsl;

int main()
{
-int_vector<> v(10000000, 3);
+int_vector<> v(10*(1<<20), 0);
v[0] = 1ULL<<63;
util::bit_compress(v);
cout << size_in_mega_bytes(v) << endl;
2 changes: 1 addition & 1 deletion tutorial/expl-04.cpp
@@ -8,7 +8,7 @@ int main()
{
bit_vector b = {1,1,0,1,0,0,1};
cout << b << endl;
-b = bit_vector(80000000, 0);
+b = bit_vector(80*(1<<20), 0);
for (size_t i=0; i < b.size(); i+=100)
b[i] = 1;
cout << size_in_mega_bytes(b) << endl;
2 changes: 1 addition & 1 deletion tutorial/expl-05.cpp
@@ -6,7 +6,7 @@ using namespace sdsl;

int main()
{
-bit_vector b = bit_vector(80000000, 0);
+bit_vector b = bit_vector(80*(1<<20), 0);
for (size_t i=0; i < b.size(); i+=100)
b[i] = 1;
cout << size_in_mega_bytes(b) << endl;
2 changes: 1 addition & 1 deletion tutorial/expl-06.cpp
@@ -6,7 +6,7 @@ using namespace sdsl;

int main()
{
-bit_vector b = bit_vector(80000000, 0);
+bit_vector b = bit_vector(80*(1<<20), 0);
for (size_t i=0; i < b.size(); i+=100)
b[i] = 1;
sd_vector<> sdb(b);
2 changes: 1 addition & 1 deletion tutorial/expl-14.cpp
@@ -7,7 +7,7 @@ using namespace sdsl;
int main()
{
wt_hutu<rrr_vector<63>> wt;
-construct_im(wt, "ハローワールド!", 1);
+construct_im(wt, "こんにちは世界", 1);
for (size_t i=0; i < wt.size(); ++i)
cout << wt[i];
cout << endl;
4 changes: 2 additions & 2 deletions tutorial/expl-23.cpp
@@ -6,11 +6,11 @@ using namespace std;

int main()
{
-cst_sct3<csa_wt<>, lcp_support_sada<>> cst1;
+cst_sct3<csa_wt<wt_huff<rrr_vector<>>>, lcp_support_sada<>> cst1;
construct(cst1, "english.200MB", 1);
cout << "cst1.lcp in MiB : " << size_in_mega_bytes(cst1.lcp) << endl;
util::clear(cst1);
-cst_sct3<csa_wt<>, lcp_dac<>> cst2;
+cst_sct3<csa_wt<wt_huff<rrr_vector<>>>, lcp_dac<>> cst2;
construct(cst2, "english.200MB", 1);
cout << "cst2.lcp in MiB : " << size_in_mega_bytes(cst2.lcp) << endl;
}
17 changes: 17 additions & 0 deletions tutorial/expl-24.cpp
@@ -0,0 +1,17 @@
#include <sdsl/bp_support.hpp>
#include <iostream>

using namespace std;
using namespace sdsl;

int main()
{
// 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
// ( ( ( ) ( ) ) ( ) ( ( ( ) ( ) ) ( ) ) )
bit_vector b = {1,1,1,0,1,0,0,1,0,1,1,1,0,1,0,0,1,0,0,0};
bp_support_sada<> bps(&b); // <- pointer to b
cout << bps.find_close(0) << ", "
<< bps.find_open(3) << ", "
<< bps.enclose(4) << ", "
<< bps.double_enclose(13, 16) << endl;
}
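To make the four queries in `expl-24.cpp` concrete, here is a naive linear-scan reference in plain C++. It is only a sketch of what the operations mean on a balanced parentheses sequence, not how `bp_support_sada` computes them; the alias `bp_seq` and the convention of returning `b.size()` when no answer exists are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using bp_seq = std::vector<bool>; // true = '(', false = ')'

// Position of the ')' matching the '(' at position i.
std::size_t find_close(const bp_seq& b, std::size_t i)
{
    assert(i < b.size() && b[i]);
    std::size_t depth = 0;
    for (std::size_t j = i; j < b.size(); ++j) {
        if (b[j]) ++depth; else --depth;
        if (depth == 0) return j;
    }
    return b.size(); // unbalanced input
}

// Position of the '(' matching the ')' at position i.
std::size_t find_open(const bp_seq& b, std::size_t i)
{
    assert(i < b.size() && !b[i]);
    std::size_t depth = 0;
    for (std::size_t j = i + 1; j-- > 0;) {
        if (b[j]) --depth; else ++depth;
        if (depth == 0) return j;
    }
    return b.size(); // unbalanced input
}

// Opening parenthesis of the tightest pair that encloses the '(' at i.
std::size_t enclose(const bp_seq& b, std::size_t i)
{
    assert(i < b.size() && b[i]);
    std::size_t closed = 0; // ')' seen to the left that still need a '('
    for (std::size_t j = i; j-- > 0;) {
        if (!b[j]) ++closed;
        else if (closed == 0) return j;
        else --closed;
    }
    return b.size(); // i is not enclosed by any pair
}

// Opening parenthesis of the tightest pair that encloses both the pair
// starting at i and the pair starting at j, for i < j.
std::size_t double_enclose(const bp_seq& b, std::size_t i, std::size_t j)
{
    assert(i < j);
    std::size_t k = enclose(b, i);
    while (k < b.size() && find_close(b, k) < j)
        k = enclose(b, k);
    return k;
}
```

For the sequence in the example, these naive versions return 19, 2, 1 and 9 for `find_close(0)`, `find_open(3)`, `enclose(4)` and `double_enclose(13, 16)`.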
19 changes: 19 additions & 0 deletions tutorial/expl-25.cpp
@@ -0,0 +1,19 @@

#include <sdsl/bp_support.hpp>
#include <iostream>

using namespace std;
using namespace sdsl;

int main()
{
// 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
// ( ( ( ) ( ) ) ( ) ( ( ( ) ( ) ) ( ) ) )
bit_vector b = {1,1,1,0,1,0,0,1,0,1,1,1,0,1,0,0,1,0,0,0};
bp_support_sada<> bps(&b); // <- pointer to b
for (size_t i=0; i < b.size(); ++i)
cout << bps.excess(i)<< " ";
cout << endl;
cout << bps.rank(0) << ", " // inclusive rank for BPS!!!
<< bps.select(4) << endl;
}
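The `excess`, `rank` and `select` calls in `expl-25.cpp` can be checked against a naive reference. The sketch below is plain C++ and rests on two assumptions: that `excess(i)` and `rank(i)` are inclusive of position `i` (as the comment in the example says for `rank`), and that `select(k)` counts opening parentheses starting from 1. The function names are chosen for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using bp_seq = std::vector<bool>; // true = '(', false = ')'

// Number of '(' in b[0..i], inclusive.
std::size_t naive_rank(const bp_seq& b, std::size_t i)
{
    assert(i < b.size());
    std::size_t opens = 0;
    for (std::size_t j = 0; j <= i; ++j)
        if (b[j]) ++opens;
    return opens;
}

// Number of '(' minus number of ')' in b[0..i], inclusive.
std::size_t naive_excess(const bp_seq& b, std::size_t i)
{
    // i + 1 positions in total, so closes = (i + 1) - opens.
    return 2 * naive_rank(b, i) - (i + 1);
}

// Position of the k-th '(' for k >= 1; b.size() if there are fewer than k.
std::size_t naive_select(const bp_seq& b, std::size_t k)
{
    assert(k >= 1);
    for (std::size_t j = 0; j < b.size(); ++j)
        if (b[j] && --k == 0) return j;
    return b.size();
}
```

On the sequence from the example, the excess starts at 1, peaks at 4 at position 11 and ends at 0, while `naive_rank(b, 0)` is 1 and `naive_select(b, 4)` is 4.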
15 changes: 15 additions & 0 deletions tutorial/expl-26.cpp
@@ -0,0 +1,15 @@
#include <sdsl/rmq_support.hpp>
#include <iostream>

using namespace std;
using namespace sdsl;

int main()
{
// 0 1 2 3 4 5 6 7 8 9 0
int_vector<> v = {5,3,8,9,1,2,5,3,9,0,7};
rmq_succinct_sct<> rmq(&v); // <- pointer to v
util::clear(v);
cout << "v.size() = " << v.size() << endl;
cout << rmq(0, 10) << ", " << rmq(2, 7) << endl;
}
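A range minimum query returns the index of the smallest element in a closed range. The naive reference below, in plain C++, is a sketch for checking results and assumes ties are broken towards the leftmost position; the name `naive_rmq` is hypothetical. Unlike the succinct structure in `expl-26.cpp`, which is still queried after `util::clear(v)`, this version needs the original values at query time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Index of the leftmost minimum of v[l..r], both bounds inclusive.
std::size_t naive_rmq(const std::vector<std::uint64_t>& v, std::size_t l, std::size_t r)
{
    assert(l <= r && r < v.size());
    auto first = v.begin() + static_cast<std::ptrdiff_t>(l);
    auto last  = v.begin() + static_cast<std::ptrdiff_t>(r) + 1;
    // std::min_element returns the first of several equal minima.
    return static_cast<std::size_t>(std::min_element(first, last) - v.begin());
}
```

For the vector in the example, `naive_rmq(v, 0, 10)` is 9 (value 0) and `naive_rmq(v, 2, 7)` is 4 (value 1).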
