Merge doc dir with meeting minutes to master #3

Open
wants to merge 17 commits into
base: master
Binary file added doc/images/sampleapp_datamodel.png
44 changes: 44 additions & 0 deletions doc/meetingminutes/2018-04-26_meeting minutes.md
@@ -0,0 +1,44 @@
# Meeting Minutes

26.04.2018

Participants:

- Sebastian Schmidl
- Frederic Schneider

## Agenda

1. Recap individual work
2. Sync understanding of _Manifesto_
- Defines new DB type: _Actor Database System_
- just a vision with a small prototype
- proposes tenets and needed features
3. Decide direction of interface and the whole project
- see below
4. Scala Resources
- Web-Scala-REPL: https://scastie.scala-lang.org/
- https://www.scala-exercises.org/
- Akka reference: https://doc.akka.io/docs/akka/current/index-actors.html?language=scala
5. Next tasks
- Sebi: create doc folder structure & upload meeting notes
- Sebi: create issue labels
- Sebi: create Hello-World in akka
- Sebi: intro to akka
- Fred: intro to scala and akka
6. Next meeting
- Monday, 30.04.2018, 9:15 am, on campus

## Decisions

- application and data in one tier
- Design of an Actor Database Framework
- like _Domain Driven Design_: splits domain/_schema_ into domain entities as actors
- provides persistence, domain actors, ways to declare state relations and methods
- Interface: the framework provides an `Interface` for domain actors (`DomainActor`) to define relations and methods (see _Actor Database Systems: A Manifesto_). Domain actors communicate via asynchronous messaging, and their state is stored in child actors (`SKActor`) of those domain actors.
- example: https://scastie.scala-lang.org/CodeLionX/kSvVknaOTYiqdUPoe4nEOQ
- first: data only in-memory, relational schemas (no KV-store)

- Example Application
- uses Actor Database Framework
- [Police Data Model](http://www.databaseanswers.org/data_models/police_canonical_data_model/index.htm)
52 changes: 52 additions & 0 deletions doc/meetingminutes/2018-04-30_meeting minutes.md
@@ -0,0 +1,52 @@
# Meeting Minutes

30.04.2018

Participants:

- Sebastian Schmidl
- Frederic Schneider

## Agenda

1. Scala Intro and IDE setup
- look into book _Programming Scala_ (O'Reilly)
2. Sync of understanding of paper _Reactors: A Case for Predictable, Virtualized Actor Database Systems_
3. Recap tasks
- [x] Sebi: create doc folder structure & upload meeting notes
- [x] Sebi: create issue labels
- [x] Sebi: create Hello-World in akka
- [x] Sebi: intro to akka
- [x] Fred: intro to scala and akka (scala done, akka not yet)
4. Work on Project (see [below](#Decisions))
5. Next tasks
- [ ] Fred: intro to akka
- [ ] implement in-memory data store using defined _Relation_ semantics below
- [ ] develop a way to declare a `Relation`, which uses the data store
- [ ] `Relation` should provide default SQL methods: select, update, delete, insert
- [ ] there are two kinds of `Relation`: `ColumnStoreRelation` and `RowStoreRelation`
6. Next meeting
- Monday, 30.04.2018, 9:15 am, on campus
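
The `Relation` tasks above could be sketched roughly as follows. This is only an illustration under assumptions: the record type `Rec`, the predicate-based signatures and the fixed column list of the column store are hypothetical, and only the names `Relation`, `RowStoreRelation` and `ColumnStoreRelation` come from the task list.

```scala
import scala.collection.mutable

// Hypothetical record type: column name -> untyped value.
final case class Rec(values: Map[String, Any])

// The four default operations named in the task list (signatures are assumptions).
trait Relation {
  def insert(record: Rec): Unit
  def select(where: Rec => Boolean): Seq[Rec]
  def update(where: Rec => Boolean)(change: Rec => Rec): Int
  def delete(where: Rec => Boolean): Int
}

// Row store: one buffer entry per record.
final class RowStoreRelation extends Relation {
  private val rows = mutable.ArrayBuffer.empty[Rec]

  def insert(record: Rec): Unit = rows += record
  def select(where: Rec => Boolean): Seq[Rec] = rows.filter(where).toList
  def update(where: Rec => Boolean)(change: Rec => Rec): Int = {
    val hits = rows.indices.filter(i => where(rows(i)))
    hits.foreach(i => rows(i) = change(rows(i)))
    hits.size
  }
  def delete(where: Rec => Boolean): Int = {
    val hits = rows.indices.filter(i => where(rows(i)))
    hits.reverse.foreach(i => rows.remove(i)) // remove from the back so indices stay valid
    hits.size
  }
}

// Column store: one buffer per column; assumes distinct column names.
final class ColumnStoreRelation(columns: Seq[String]) extends Relation {
  private val data: Map[String, mutable.ArrayBuffer[Any]] =
    columns.map(c => c -> mutable.ArrayBuffer.empty[Any]).toMap
  private def size: Int = data.values.headOption.map(_.size).getOrElse(0)
  private def row(i: Int): Rec = Rec(columns.map(c => c -> data(c)(i)).toMap)
  private def hits(where: Rec => Boolean): Seq[Int] =
    (0 until size).filter(i => where(row(i)))

  def insert(record: Rec): Unit = {
    require(columns.forall(record.values.contains), "record misses a column")
    columns.foreach(c => data(c) += record.values(c))
  }
  def select(where: Rec => Boolean): Seq[Rec] = hits(where).map(row)
  def update(where: Rec => Boolean)(change: Rec => Rec): Int = {
    val matched = hits(where)
    matched.foreach { i =>
      val changed = change(row(i))
      columns.foreach(c => data(c)(i) = changed.values(c))
    }
    matched.size
  }
  def delete(where: Rec => Boolean): Int = {
    val matched = hits(where)
    matched.reverse.foreach(i => columns.foreach(c => data(c).remove(i)))
    matched.size
  }
}
```

Both stores answer the same calls, so application code would not need to know which layout is used.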

## Decisions

- We change the example from _Police Data Model_ to _Shopping Cart_ from the manifesto paper
- better communication to other teams
- already decomposed to domain entities

![Shopping Cart Model](../images/sampleapp_datamodel.png)

- Paper _Reactors: A Case for Predictable, Virtualized Actor Database Systems_ provides good insights for implementing our Framework idea
- We want to consider the following concepts:
1.

> A reactor is an application-defined logical actor that encapsulates state abstracted using relations.
>
> _Reactor_ paper

2. Computations across logical actors have transactional guarantees. _We have to figure out which_.

- _Relation_: How to store relations in-memory?
- is a named set of tuples
- [StackOverflow post about typed map with different types](https://stackoverflow.com/questions/17684023/different-types-in-map-scala)
- implementation ideas: https://hpi.slack.com/files/U7FTCE45N/FAF3HEE3B/image_uploaded_from_ios.jpg
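
One way to store differently typed values in a single map, as discussed in the linked StackOverflow post, is to key the map by a column definition that carries the value type. The following is a minimal sketch under assumptions: `TypedRecord` and its methods are hypothetical names, and the cast is only safe because values can enter the map solely through `set`.

```scala
// A column definition carries its value type as a type parameter.
// Plain class, so two columns are equal only if they are the same instance.
final class ColumnDef[T](val name: String)

// Hypothetical record: cells are stored untyped and read back through the typed column.
final class TypedRecord private (cells: Map[ColumnDef[_], Any]) {
  def set[T](col: ColumnDef[T], value: T): TypedRecord =
    new TypedRecord(cells.updated(col, value))

  // Safe cast: the only way to store a value under `col` is `set`, which enforces T.
  def get[T](col: ColumnDef[T]): Option[T] =
    cells.get(col).map(_.asInstanceOf[T])
}

object TypedRecord {
  val empty: TypedRecord = new TypedRecord(Map.empty)
}

// A relation as "a named set of tuples".
final case class NamedRelation(name: String, tuples: Set[TypedRecord])
```

With `val nameCol = new ColumnDef[String]("name")`, the call `TypedRecord.empty.set(nameCol, "Ada").get(nameCol)` has the static type `Option[String]`, so type mismatches are caught at compile time.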
93 changes: 93 additions & 0 deletions doc/meetingminutes/2018-05-09_meeting minutes.md
@@ -0,0 +1,93 @@
# Meeting Minutes

09.05.2018

Participants:

- Sebastian Schmidl
- Frederic Schneider

## Agenda

1. Open Operations/Issues on `Relation` and corresponding interfaces
2. Prospective interface and name for abstract actors
3. Focus of this project and main topics of interest
4. Next TODOs with deadlines

## Decisions

1. For point 1:

- Use `Set(columnDefs)` for defining relations and records instead of `Seq(columnDefs)` to prohibit duplicate column names.
One issue remains unfixed: It's possible to create two `ColumnDef`s with the same name and different `valueTypes`.
- Keep name `project` for projecting relations and records to other schemas (like SQL _SELECT_)
- Needed features:
- Capability to create primary keys, uniqueness constraint and accessor: `get(key: Int): Record`.
Keys are always `Int`. Primary key generation possible.
- Tests for all public interfaces
- Where-Condition-Builder is optional
- improve `RecordBuilder` with better type-mismatch error messages
- We don't allow joins on actor-internal relations
- ResultSets of record operations are represented as `Seq[Record]`
- `Relation` returns `Seq[Record]`, but for chaining `where`s and `project`s it will be more useful to always return a new `Relation` or a similar construct.
We decided to keep it as it is, and postpone a decision if we need one.
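
The open `ColumnDef` issue and the primary-key requirement from point 1 can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ColDef` is a stand-in for the project's `ColumnDef`, `KeyedRelation` is a hypothetical name, and `get` returns an `Option` rather than a bare `Record`.

```scala
import scala.collection.mutable

// Stand-in for ColumnDef: equality covers name AND value type,
// so a Set keeps two definitions that share a name but differ in type.
final case class ColDef(name: String, valueType: Class[_])

object ColDefs {
  // Returns the column names that occur with more than one value type.
  def conflictingNames(columnDefs: Set[ColDef]): Set[String] =
    columnDefs.groupBy(_.name).collect { case (name, defs) if defs.size > 1 => name }.toSet
}

// Relation with Int primary keys, key generation and a uniqueness check.
final class KeyedRelation {
  private val rows = mutable.Map.empty[Int, Map[String, Any]]
  private var nextKey = 0

  // Inserts under a generated primary key and returns that key.
  def insert(values: Map[String, Any]): Int = {
    val key = nextKey
    nextKey += 1
    rows(key) = values
    key
  }

  // Inserts under an explicit key; rejects keys that are already taken.
  def insertWithKey(key: Int, values: Map[String, Any]): Either[String, Int] =
    if (rows.contains(key)) Left(s"primary key $key already exists")
    else {
      rows(key) = values
      nextKey = math.max(nextKey, key + 1) // generated keys never collide with explicit ones
      Right(key)
    }

  def get(key: Int): Option[Map[String, Any]] = rows.get(key)
}
```

A check like `conflictingNames` could run when a relation is declared, which would close the gap that `Set(columnDefs)` alone leaves open.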

2. For point 2:

- We define one or more super classes for application Actors,
which are responsible for all framework related tasks.
- example use code:

```scala
class MyUserActor(userId: Int) extends Dactor(userId) {
  // the actor name is built by Dactor
  object UserDetails extends RowRelation {
    val nameCol: ColumnDef[String] = ColumnDef("name")
    ...
    override val columnDefs = Seq(nameCol, ...)
  }
  override val relations = Map("UserDetails" -> UserDetails)

  override def receive: Receive = {
    case SomeRequest(msg: String) => UserDetails.insert(UserDetails.newRecord(...))
  }
}
```
- change `Relation` and subclasses to be used as `RowRelationDef` from the above example and get rid of the `R` val

3. Focus and Topics

- we will not provide an SQL-like interface
- we will be dealing with **asynchronous calls** anyway
- the interface to the outside will be implemented by the application developer as we combine application logic and data storing in one tier (Actors)
- calls between Actors are asynchronous, but are wrapped in future-like constructs to allow synchronization for application devs.
- we could imagine creating a showcase with an HTTP server (which is synchronous) for outbound communication
- **Multi-table operations**
- highly depend on the domain layout declared by the application dev.
- again, there is the possibility to create a showcase
- **Partitioning**
- our concept of distributing data and application logic to domain-level actors is a type of partitioning
- one actor contains a subset of all data of a certain relation/table, e.g.:
Customer Actor 1 contains personal details of customer 1 and their transactions,
Customer Actor 2 contains personal details of customer 2 and their transactions, ...
- **Access Control**
- What is Access Control? Where will it be declared? Where is it enforced? What is the granularity?
- `User`, `Group`, `AuthenticationActor`, access rights, access control matrix or other method?
- maybe transactions between domain-level actors
- outside API of our framework will be **asynchronous**
- we will focus on **improving throughput** instead of enforcing consistency
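
The combination of partitioning, asynchronous calls and future-like synchronization described above could look roughly like this. It is a sketch under assumptions, using only `scala.concurrent` and no Akka: `CustomerPartition` is a hypothetical stand-in for one domain-level actor, and `PartitionDemo` is an invented name.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-in for one customer actor: it holds only this customer's data
// and answers asynchronously.
final class CustomerPartition(val customerId: Int, transactions: Seq[BigDecimal]) {
  def totalSpent(implicit ec: ExecutionContext): Future[BigDecimal] =
    Future(transactions.sum)
}

object PartitionDemo {
  // Multi-partition operation: fan out to every partition, then combine the replies.
  def totalRevenue(partitions: Seq[CustomerPartition])(
      implicit ec: ExecutionContext): Future[BigDecimal] =
    Future.sequence(partitions.map(p => p.totalSpent)).map(_.sum)

  // Synchronous edge, e.g. what a blocking HTTP handler would call.
  def totalRevenueBlocking(partitions: Seq[CustomerPartition]): BigDecimal = {
    import ExecutionContext.Implicits.global
    Await.result(totalRevenue(partitions), 5.seconds)
  }
}
```

Blocking happens only at the outermost edge; everything between partitions stays asynchronous, which matches the focus on throughput.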

4. TODOs:

- (finished) Sebi: Finish `project()` functionality and tests for `Record`
- (09.05.18) Sebi: implement better `RecordBuilder`
- (10.05.18) Sebi: refactor `Relation` classes (see point 2)
- (14.05.18) Sebi & Fred: `Dactor`- interface for defining domain actors
- (09.05.18) Fred: review PRs: #13, #23
- (10.05.18) Fred: `Seq(columnDefs)` -> `Set(columnDefs)` refactoring, think about issue: _same column name different type_
- (10.05.18) Fred: Write all tests except those for `Record` and mark `ColumnRelation` as _deprecated_
- (11.05.18) Fred: Check in `Relation` that inserted `Record`s have the correct columns (issue #10)

# Next Meeting

Monday, 14.05.2018
51 changes: 51 additions & 0 deletions doc/meetingminutes/2018-05-16_meeting minutes.md
@@ -0,0 +1,51 @@
# Meeting Minutes

16.05.2018

Participants:

- Sebastian Schmidl
- Thorsten Papenbrock
- Other Teams

## Agenda

1. Feedback and Remarks to Project Status

- Persistence: our decision
- we don't have to support it
- it is taken into consideration if we have persistence
- Actor-Model: Could be interesting to find out overhead
- overhead of actors memory-wise
- load lots of data into memory in a normal Java/Scala-class (naive impl.)
- compare to loading the same data into memory distributed among actors
- Actor-Distribution
- physical location of actors
- load balancing, actor migration, message routing
- maybe: new layer between akka remoting and application dev.
- could use own cost model (basic one: random, round-robin)
- could provide a meta-actor-pool (akka actor-sharding)
- Access Control
- interesting: row- or column-access-control
- there are a lot of rules and implementing these is easy
- granularity: what makes sense? do we even need to look at it?
- What challenges are interesting research instead of simple implementations?
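
The two basic cost models mentioned under actor distribution, random and round-robin, could be sketched as interchangeable placement strategies. The trait and class names below are hypothetical, nodes are plain strings for illustration, and the round-robin counter is not thread-safe.

```scala
import scala.util.Random

// Decides on which node a new actor should be created.
trait PlacementStrategy {
  def chooseNode(actorName: String): String
}

// Cycles through the nodes in order.
final class RoundRobinPlacement(nodes: IndexedSeq[String]) extends PlacementStrategy {
  require(nodes.nonEmpty, "at least one node is required")
  private var next = 0

  def chooseNode(actorName: String): String = {
    val node = nodes(next)
    next = (next + 1) % nodes.size
    node
  }
}

// Picks a node uniformly at random; the generator can be seeded for repeatable tests.
final class RandomPlacement(nodes: IndexedSeq[String], rng: Random = new Random())
    extends PlacementStrategy {
  require(nodes.nonEmpty, "at least one node is required")

  def chooseNode(actorName: String): String = nodes(rng.nextInt(nodes.size))
}
```

A custom cost model would be a third implementation of the same trait, for example one that uses `actorName` to keep related actors on the same node.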

2. Mid-term presentation

- on 11.06.2018 (alternative: 13.06.2018)
- Thorsten will write an e-mail with details
- with other researchers of the _Information Systems_ chair
- topics and content:
- goal of our project
- assumptions and things we intentionally left out
- core challenge we are working on and why it is difficult
- possible solutions
- features, diagrams, What does our software do?
- goal of the presentation:
- get feedback
- insights into possible problems and solutions

# Next Meeting

open (tbd)
36 changes: 36 additions & 0 deletions doc/meetingminutes/2018-05-19_meeting minutes.md
@@ -0,0 +1,36 @@
# Meeting Minutes

19.05.2018

Participants:

- Sebastian Schmidl
- Frederic Schneider

## Agenda

1. Sync-up for group and supervisor feedback from 16.05.2018

- **Persistence** is not a central aspect for the project so we will keep it **on hold** for now
- **Access control** was deemed not too interesting as a research topic by supervisor
- There is **interest** in the memory **overhead** of our approach as well as in how to handle actor **distribution**

2. Self-management and organisation

- We keep focusing too much on SE related optimizations and improvements to the framework interfaces
- From now on we will have to **focus on the research project's interests**, i.e. implications of the actor model and *Akka* usage for the DB

3. Short- and Mid-term goals and respective tasks definition

- *Short-term:* Finish the sample application's predefined message patterns / functions:
- `add_items`
- `checkout`
- `get_variable_discount_update_inventory`
- *Short-term:* `insert` message Requests for all `Relation`s of the sample application's `Dactor`s
- *Short-term:* System startup: a routine run from `Main` that does all relevant *Akka* setup and possibly any setup necessary for our framework
- *Short-term:* Test (actor) that statically creates some content for all `Dactor` extending classes in the sample application and then queries some data back to test that the messaging is working appropriately
- *Mid-term:* `TestDataInitializer` actor that reads data from a `csv` in some format and initializes `Dactor` contents based on the test data.
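
The minutes leave the `csv` format open, so the following is only a sketch under assumptions: a header line with column names, comma-separated cells and no quoting. `CsvSeed` is a hypothetical helper, not part of the framework.

```scala
object CsvSeed {
  // Parses "header line + data lines" into one map per data line (column name -> cell).
  // Assumes comma-separated values without quoting or embedded commas.
  def parse(text: String): Seq[Map[String, String]] = {
    val lines = text.split("\n").map(_.trim).filter(_.nonEmpty).toList
    lines match {
      case Nil => Seq.empty
      case header :: rows =>
        val columns = header.split(",").map(_.trim)
        rows.map { row =>
          // limit -1 keeps trailing empty cells
          val cells = row.split(",", -1).map(_.trim)
          columns.zip(cells).toMap
        }
    }
  }
}
```

A `TestDataInitializer` could turn each resulting map into an `insert` message for the matching `Dactor`.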

## Next Meeting

23.05.2018: We will have a short meeting on Wednesday to accommodate Sebastian's busy schedule, keep each other up to date and decide on further actions if we have achieved our short-term goals by then.
28 changes: 28 additions & 0 deletions doc/meetingminutes/2018-05-28_meeting minutes.md
@@ -0,0 +1,28 @@
# Meeting Minutes

28.05.2018

Participants:

- Sebastian Schmidl
- Frederic Schneider
- Thorsten Papenbrock
- other teams

## Agenda

1. Group-Sync, Feedback

- ask-pattern is bad :D
--> find a way to replace it
- we may want to encapsulate the lookup tables (LUTs) for requests in actor state
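
One possible replacement for the ask-pattern is to tag each request with an id and keep a lookup table from id to continuation in the requesting actor's state. The sketch below shows only that table in plain Scala, without Akka. The message types and the `PendingRequests` class are hypothetical, and the class is meant to be used from inside a single actor, so it is not thread-safe.

```scala
import scala.collection.mutable

// Hypothetical message types carrying a correlation id.
final case class Request(id: Long, payload: String)
final case class Response(requestId: Long, result: String)

// Lookup table an actor could hold as state: request id -> what to do with the reply.
final class PendingRequests {
  private val table = mutable.Map.empty[Long, String => Unit]
  private var nextId = 0L

  // Registers a continuation and returns the request message to send.
  def register(payload: String)(onResult: String => Unit): Request = {
    val id = nextId
    nextId += 1
    table(id) = onResult
    Request(id, payload)
  }

  // Called when a Response arrives; returns false for unknown or already completed ids.
  def complete(response: Response): Boolean =
    table.remove(response.requestId) match {
      case Some(callback) =>
        callback(response.result)
        true
      case None => false
    }

  def outstanding: Int = table.size
}
```

In an actor, `register` would be called before sending the `Request`, and `complete` from the `receive` case that handles `Response`.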

2. Mid-term presentation

- on 21.06.2018 9:15, HPI Campus II, Building F
- in English
- find order of groups

## Next Meeting

tbd
46 changes: 46 additions & 0 deletions doc/meetingminutes/2018-06-13_meeting minutes.md
@@ -0,0 +1,46 @@
# Meeting Minutes

13.06.2018

Participants:

- Sebastian Schmidl
- Frederic Schneider

## Agenda

1. Sync-up:
- Sebastian is working on the `DataInitializer` and has to work around #84: `DataInitializer` cannot be mixed into a direct `Dactor` subclass due to a linearization issue. **Workaround:** `Dactor` subclass hierarchy `Dactor <- MyDactorBase <- MyDactor with DataInitialization`
- Frederic has implemented `StoreSection` for `sampleapp`, only small changes needed to merge.
2. Next tasks:
- Sebastian:
- Finish PR #86
- Data initialization in `SystemTest` using `DataInitialization` and `csv` seed data instead of hardcoded data initialization
- [issue #62](https://github.com/CodeLionX/actordb/issues/62): Definition package refactoring
- Benchmark data generation script
- Frederic:
- Finish PR #83
- Add full functional test to `SystemTest`, i.e. provision some data, `addItems` as a Customer and then `checkout`
- [issue #68](https://github.com/CodeLionX/actordb/issues/68): Make all successful messages return `Relation`s
- [issue #76](https://github.com/CodeLionX/actordb/issues/76): Add test for `Dactor` companion object
- **Memory overhead measurement related:** Add test that creates one big `Relation` for each `Relation` type and initializes its data from the same `csv` seed files as `DataInitializer` (maybe some of its functionality can even be reused in part)
- **Memory overhead measurement related:** Load all data from `csv` seed files into memory naively e.g. simply as a `String`
- Read resources for *Spider Pattern*
- Read resources for `akka-cluster` and `akka-cluster-sharding` (see: [issue #84](https://github.com/CodeLionX/actordb/issues/84))
- Roadmap:
- See [meeting minutes from 19.05.2018](https://github.com/CodeLionX/actordb/blob/doc/meetingminutes/doc/meetingminutes/2018-05-19_meeting%20minutes.md): Interest in **memory overhead and distribution**:
- Find appropriate profiling method to measure **memory overhead**
- Add **distribution** (see: [issue #84](https://github.com/CodeLionX/actordb/issues/84))
3. Meeting with Thorsten
- Describe state of our project
- Memory overhead measurement: naive vs. `Relation` only vs. `Dactor`s. This has not been published or done by the manifesto paper and could be interesting
- **Spider Pattern** for debugging akka applications could be used for tasks like *query provisioning*, investigating data sources
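
A very rough first approximation of the memory overhead comparison is to read the used heap before and after loading the data. The sketch below uses only `java.lang.Runtime`; `HeapProbe` is a hypothetical helper, `System.gc()` is merely a hint to the JVM, and the numbers are indicative at best, so a real profiler would still be needed for the actual measurements.

```scala
object HeapProbe {
  private val rt = Runtime.getRuntime

  // Used heap in bytes after asking the JVM to collect garbage (a hint, not a guarantee).
  def usedBytes(): Long = {
    System.gc()
    rt.totalMemory() - rt.freeMemory()
  }

  // Runs `load`, keeps its result reachable, and returns it with the heap delta in bytes.
  def measure[A](load: => A): (A, Long) = {
    val before = usedBytes()
    val result = load
    val after = usedBytes()
    (result, after - before)
  }
}
```

The same `measure` call could wrap the naive load, the `Relation`-only load and the `Dactor`-based load, so that the three deltas are taken the same way.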

## Decisions

- We are using the same seed data format for `DataInitializer` and for our benchmark test seeding
- The seed data format consists of one directory per `Dactor` instance, each containing one file per `Relation` with that relation's contents. We use this format even for the benchmark tests that do not instantiate any `Dactor`s; they instead add an extra column to their `Relation` containing the name of the `Dactor` that a given `Record` belongs to.
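
How the benchmark tests could reuse the per-`Dactor` seed data is sketched below. The in-memory `Seed` type stands in for the directory layout (directory name to file name to rows), and `SeedData`, `flatten` and the default column name `dactor` are assumptions made for illustration.

```scala
object SeedData {
  type Row = Map[String, String]
  // Dactor name (directory) -> relation name (file) -> rows of that file.
  type Seed = Map[String, Map[String, Seq[Row]]]

  // Builds one big relation from all Dactor instances and adds the extra
  // column that records which Dactor each row came from.
  def flatten(seed: Seed, relation: String, dactorColumn: String = "dactor"): Seq[Row] =
    for {
      (dactorName, relations) <- seed.toSeq.sortBy(_._1)
      row <- relations.getOrElse(relation, Seq.empty)
    } yield row + (dactorColumn -> dactorName)
}
```

Sorting by `Dactor` name only makes the output order deterministic; it is not required by the format.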

## Next meeting

**tbd** depending on meeting time with Thorsten