test: better organize source test with different encodings #16931

Closed
1 of 2 tasks
xxchan opened this issue May 24, 2024 · 1 comment

Comments

@xxchan
Member

xxchan commented May 24, 2024


Currently test data looks like this:

image

Besides, there's actually another folder: src/connector/src/test_data

It's way too messy:

  • The files follow different format conventions.
  • It is unclear how they are ingested. (There are multiple commands/scripts!)
  • The filename actually follows a convention, and that convention decides how the file is ingested:
    for filename in $kafka_data_files; do
        ([ -e "$filename" ]
        base=$(basename "$filename")
        topic="${base%%.*}"
        echo "Fulfill kafka topic $topic with data from $base"
        # binary data, one message a file, filename/topic ends with "bin"
        if [[ "$topic" = *bin ]]; then
            kcat -P -b message_queue:29092 -t "$topic" "$filename"
        elif [[ "$topic" = *avro_json ]]; then
            python3 source/schema_registry_producer.py "message_queue:29092" "http://message_queue:8081" "$filename" "topic" "avro"
        elif [[ "$topic" = *json_schema ]]; then
            python3 source/schema_registry_producer.py "kafka:9093" "http://schemaregistry:8082" "$filename" "topic" "json"
        else
            cat "$filename" | kcat -P -K ^ -b message_queue:29092 -t "$topic"
        fi
        ) &
    done
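
To make the implicit convention easier to see, here is a minimal sketch that isolates the filename-to-behavior mapping from the loop above. The function name `ingest_mode` and the mode labels (`binary`, `avro`, `json_schema`, `keyed_lines`) are hypothetical and do not exist in the repo; only the suffix checks and the `${base%%.*}` topic rule are taken from the script.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repo: prints which branch of the
# ingestion loop a given data file would take.
ingest_mode() {
    local base topic
    base=$(basename "$1")
    topic="${base%%.*}"   # topic = file name up to the FIRST dot
    case "$topic" in
        *bin)         echo "binary" ;;       # one message per file, sent with kcat
        *avro_json)   echo "avro" ;;         # schema_registry_producer.py, "avro"
        *json_schema) echo "json_schema" ;;  # schema_registry_producer.py, "json"
        *)            echo "keyed_lines" ;;  # piped to kcat with key delimiter ^
    esac
}

ingest_mode "foo_bin.1"            # prints: binary
ingest_mode "orders_avro_json.1"   # prints: avro
ingest_mode "users.json_schema"    # prints: keyed_lines
```

The last call shows one way the convention can surprise: the suffix has to appear before the first dot, because everything after it is stripped when the topic name is derived.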

The result is that even experienced developers (@xiangjinwu) need to relearn the convention each time. Since we are still developing many connector features, the difficulty of adding new tests will hinder development.

Some ideas for improvement:

  • As a bare minimum, we should group test data into folders by how they are used, instead of relying on filename suffixes (and add a README explaining how to use them).
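
A rough sketch of what folder-based grouping could look like, assuming a layout of `<root>/<mode>/<topic>.<ext>`. The layout, the folder names, and both function names are assumptions made for illustration; nothing here reflects an existing structure in the repo.

```bash
#!/usr/bin/env bash
# Hypothetical layout: test_data/<mode>/<topic>.<ext>
# The parent folder, not the filename suffix, selects the ingestion mode.
ingest_mode_from_dir() {
    local mode
    mode=$(basename "$(dirname "$1")")
    case "$mode" in
        binary|avro|json_schema|keyed_lines)
            echo "$mode" ;;
        *)
            # Fail loudly instead of silently falling back to a default.
            echo "unknown ingestion folder: $mode" >&2
            return 1 ;;
    esac
}

# The topic is still derived from the file name, as in the current script.
topic_from_file() {
    local base
    base=$(basename "$1")
    echo "${base%%.*}"
}

ingest_mode_from_dir "test_data/avro/orders.1"   # prints: avro
topic_from_file "test_data/avro/orders.1"        # prints: orders
```

With this shape, a file dropped into an unrecognized folder produces an error rather than being ingested through the catch-all branch, and a README per folder can document the expected file format.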
@xxchan
Member Author

xxchan commented May 31, 2024

With #17002, we can simply do more unit tests instead of e2e tests.

We can also develop web apps https://risingwave-labs.slack.com/archives/C03A2PSS8KU/p1717127312643039

We can also do fuzz tests
