feat: log ingestion support #4014

Merged on Jun 14, 2024 (64 commits)
Changes from 5 commits
Commits
63491d9
chore: add log http ingester scaffold
paomian May 22, 2024
4d2ec3b
chore: add some example code
paomian May 22, 2024
2e51c16
chore: add log inserter
paomian May 27, 2024
fbc66ec
chore: add log handler file
paomian May 27, 2024
cd4d83d
chore: add pipeline lib
paomian May 27, 2024
2bc1937
chore: import log handler
paomian May 29, 2024
2b16ef9
chore: add pipeline http handler
paomian May 29, 2024
f1350cd
chore: add pipeline private table
paomian May 30, 2024
1d52cad
chore: add pipeline API
paomian May 31, 2024
8c69abb
chore: improve error handling
paomian May 31, 2024
7e0a9ad
Merge branch 'main' into feat/log-handler
shuiyisong Jun 3, 2024
73432dc
chore: merge main
shuiyisong Jun 3, 2024
9d7284c
Merge pull request #6 from shuiyisong/chore/merge_main
paomian Jun 3, 2024
1a03b7e
chore: add multi content type support for log handler
paomian Jun 3, 2024
a2f1230
Merge branch 'main' into feat/log-handler
shuiyisong Jun 4, 2024
6a0998d
refactor: remove servers dep on pipeline
shuiyisong Jun 3, 2024
443eaf9
refactor: move define_into_tonic_status to common-error
shuiyisong Jun 3, 2024
c8ce4ee
refactor: bring in pipeline 3eb890c551b8d7f60c4491fcfec18966e2b210a4
shuiyisong Jun 4, 2024
eb9cd22
chore: fix typo
shuiyisong Jun 4, 2024
8d0595c
refactor: bring in pipeline a95c9767d7056ab01dd8ca5fa1214456c6ffc72c
shuiyisong Jun 4, 2024
061b14e
chore: fix typo and license header
shuiyisong Jun 4, 2024
c152472
refactor: move http event handler to a separate file
shuiyisong Jun 4, 2024
ddea3c1
chore: add test for pipeline
paomian Jun 4, 2024
162e92f
Merge branch 'main' into feat/log-handler
shuiyisong Jun 4, 2024
5a7a5be
chore: update
shuiyisong Jun 4, 2024
423e51e
chore: fmt
shuiyisong Jun 4, 2024
51df233
Merge pull request #7 from shuiyisong/refactor/log_handler
paomian Jun 4, 2024
8066eb3
refactor: bring in pipeline 7d2402701877901871dd1294a65ac937605a6a93
shuiyisong Jun 4, 2024
e2a2e50
refactor: move `pipeline_operator` to `pipeline` crate
shuiyisong Jun 4, 2024
209a1a3
chore: minor update
shuiyisong Jun 4, 2024
c110adb
refactor: bring in pipeline 1711f4d46687bada72426d88cda417899e0ae3a4
shuiyisong Jun 5, 2024
1047dd7
chore: add log
shuiyisong Jun 5, 2024
2ff2fda
chore: add log
shuiyisong Jun 5, 2024
8b6a652
chore: remove open hook
shuiyisong Jun 5, 2024
6ca15ad
Merge pull request #8 from shuiyisong/refactor/log
paomian Jun 5, 2024
1298b0a
chore: minor update
shuiyisong Jun 5, 2024
ea548b0
chore: fix fmt
shuiyisong Jun 5, 2024
fb13278
Merge pull request #9 from shuiyisong/refactor/log
paomian Jun 5, 2024
6c88b89
chore: minor update
shuiyisong Jun 5, 2024
eeed85e
chore: rename desc for pipeline table
shuiyisong Jun 5, 2024
f77d20b
refactor: remove updated_at in pipelines
shuiyisong Jun 5, 2024
38ed6bb
Merge pull request #10 from shuiyisong/chore/polish_code
paomian Jun 5, 2024
5815675
chore: add more content type support for log inserter api
paomian Jun 5, 2024
c84ef0e
Merge pull request #11 from paomian/feat/log-handler-v2
paomian Jun 5, 2024
2e69655
chore: introduce pipeline crate
shuiyisong Jun 5, 2024
ca9525d
Merge branch 'chore/introduce_pipeline' into feat/log-handler
shuiyisong Jun 5, 2024
85a4c32
Merge branch 'main' into feat/log-handler
shuiyisong Jun 6, 2024
77ef015
chore: update upload pipeline api
paomian Jun 6, 2024
43a57a7
chore: fix by pr commit
paomian Jun 6, 2024
3560285
chore: add some doc for pub fn/struct
paomian Jun 6, 2024
4872c8a
chore: some minor fix
paomian Jun 6, 2024
11933b0
chore: add pipeline version support
paomian Jun 6, 2024
92a2bda
chore: impl log pipeline version
paomian Jun 7, 2024
8216854
chore: fix format issue
paomian Jun 12, 2024
6aed131
fix: make the LogicalPlan of a query pipeline sorted in desc order
paomian Jun 12, 2024
db827df
chore: remove some debug log
paomian Jun 13, 2024
998b243
chore: replacing hashmap cache with moka
paomian Jun 14, 2024
12c3e04
chore: fix by pr commit
paomian Jun 14, 2024
7820cd7
chore: fix toml format issue
paomian Jun 14, 2024
459748b
Merge remote-tracking branch 'origin/main' into feat/log-handler
paomian Jun 14, 2024
0d35e1a
chore: update Cargo.lock
paomian Jun 14, 2024
b57eed6
chore: fix by pr commit
paomian Jun 14, 2024
4433e6c
chore: fix some issue by pr commit
paomian Jun 14, 2024
a6734d3
chore: add more doc for pipeline version
paomian Jun 14, 2024
45 changes: 41 additions & 4 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 3 additions & 1 deletion src/frontend/src/instance.rs
@@ -15,6 +15,7 @@
pub mod builder;
mod grpc;
mod influxdb;
mod log_handler;
mod opentsdb;
mod otlp;
mod prom_store;
@@ -66,7 +67,7 @@ use servers::prometheus_handler::PrometheusHandler;
use servers::query_handler::grpc::GrpcQueryHandler;
use servers::query_handler::sql::SqlQueryHandler;
use servers::query_handler::{
InfluxdbLineProtocolHandler, OpenTelemetryProtocolHandler, OpentsdbProtocolHandler,
InfluxdbLineProtocolHandler, LogHandler, OpenTelemetryProtocolHandler, OpentsdbProtocolHandler,
PromStoreProtocolHandler, ScriptHandler,
};
use servers::server::ServerHandlers;
@@ -100,6 +101,7 @@ pub trait FrontendInstance:
+ OpenTelemetryProtocolHandler
+ ScriptHandler
+ PrometheusHandler
+ LogHandler
+ Send
+ Sync
+ 'static
57 changes: 57 additions & 0 deletions src/frontend/src/instance/log_handler.rs
@@ -0,0 +1,57 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

use api::v1::RowInsertRequests;
use async_trait::async_trait;
use auth::{PermissionChecker, PermissionCheckerRef, PermissionReq};
use client::Output;
use common_error::ext::BoxedError;
use servers::error::{AuthSnafu, ExecuteGrpcRequestSnafu};
use servers::query_handler::LogHandler;
use session::context::QueryContextRef;
use snafu::ResultExt;

use super::Instance;

#[async_trait]
impl LogHandler for Instance {
async fn insert_log(
&self,
log: RowInsertRequests,
ctx: QueryContextRef,
) -> servers::error::Result<Output> {
self.plugins
.get::<PermissionCheckerRef>()
.as_ref()
// This is a bug, it should be PermissionReq::LogWrite
.check_permission(ctx.current_user(), PermissionReq::PromStoreWrite)
.context(AuthSnafu)?;

self.handle_log_inserts(log, ctx).await
}
}

impl Instance {
pub async fn handle_log_inserts(
&self,
log: RowInsertRequests,
ctx: QueryContextRef,
) -> servers::error::Result<Output> {
self.inserter
.handle_log_inserts(log, ctx, self.statement_executor.as_ref())
.await
.map_err(BoxedError::new)
.context(ExecuteGrpcRequestSnafu)
}
}
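
The handler above checks the caller's permission first and only then delegates to the insert path. A minimal, synchronous sketch of that shape follows. Every type here (`AllowList`, the simplified `Instance`, the `Error` enum) is a hypothetical stand-in rather than the real `auth` or `servers` API, and it requests `LogWrite` only because the in-diff comment says that is the intended permission.

```rust
/// Hypothetical stand-in for `auth::PermissionReq`; only the variants needed here.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PermissionReq {
    PromStoreWrite,
    LogWrite,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum Error {
    PermissionDenied { user: String, req: PermissionReq },
}

/// Simplified, synchronous version of a permission checker.
trait PermissionChecker {
    fn check_permission(&self, user: &str, req: PermissionReq) -> Result<(), Error>;
}

/// Toy checker: a user may do exactly what is listed for them.
struct AllowList {
    allowed: Vec<(String, PermissionReq)>,
}

impl PermissionChecker for AllowList {
    fn check_permission(&self, user: &str, req: PermissionReq) -> Result<(), Error> {
        if self.allowed.iter().any(|(u, r)| u.as_str() == user && *r == req) {
            Ok(())
        } else {
            Err(Error::PermissionDenied { user: user.to_string(), req })
        }
    }
}

/// Stand-in for the frontend instance; the checker is an optional plugin.
struct Instance {
    checker: Option<Box<dyn PermissionChecker>>,
}

impl Instance {
    /// Check the permission first, then delegate to the insert path.
    fn insert_log(&self, user: &str, rows: &[&str]) -> Result<usize, Error> {
        // Assumption for this sketch: no checker configured means no restriction.
        if let Some(checker) = &self.checker {
            checker.check_permission(user, PermissionReq::LogWrite)?;
        }
        self.handle_log_inserts(rows)
    }

    /// Placeholder for the real insert: just reports how many rows it received.
    fn handle_log_inserts(&self, rows: &[&str]) -> Result<usize, Error> {
        Ok(rows.len())
    }
}
```

With an `AllowList` that grants `LogWrite` to one user, `insert_log` succeeds for that user and returns `PermissionDenied` for anyone else. With `checker: None` it always delegates.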
2 changes: 2 additions & 0 deletions src/frontend/src/server.rs
@@ -90,6 +90,8 @@ where
Some(self.instance.clone()),
);

builder = builder.with_log_ingest_handler(self.instance.clone());

if let Some(user_provider) = self.plugins.get::<UserProviderRef>() {
builder = builder.with_user_provider(user_provider);
}
152 changes: 120 additions & 32 deletions src/operator/src/insert.rs
@@ -67,6 +67,12 @@ pub struct Inserter {

pub type InserterRef = Arc<Inserter>;

enum TableType {
Logical(String),
Physical,
Log,
}

impl Inserter {
pub fn new(
catalog_manager: CatalogManagerRef,
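
The `TableType` enum in this hunk replaces the earlier `Option<String>` parameter (`on_physical_table`), so a third case, log tables, can be expressed without overloading `None`. The sketch below mirrors the enum and shows the kind of exhaustive `match` it enables. `creation_path` is a hypothetical helper added for illustration; the strings it returns only name the methods the diff calls for each variant.

```rust
/// Mirrors the enum introduced in the diff.
#[derive(Debug, Clone, PartialEq, Eq)]
enum TableType {
    /// A logical table created on top of the named physical table.
    Logical(String),
    Physical,
    Log,
}

/// Hypothetical helper: names the creation path each variant selects.
fn creation_path(table_type: &TableType) -> String {
    // The match is exhaustive, so adding a variant later forces
    // every dispatch site to be updated at compile time.
    match table_type {
        TableType::Logical(physical) => format!("create_logical_tables on {physical}"),
        TableType::Physical => "create_table".to_string(),
        TableType::Log => "create_log_table".to_string(),
    }
}
```

With `Option<String>`, `None` had to stand for "everything that is not a logical table"; the enum makes each case explicit.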
@@ -109,7 +115,37 @@ impl Inserter {
validate_column_count_match(&requests)?;

let table_name_to_ids = self
.create_or_alter_tables_on_demand(&requests, &ctx, None, statement_executor)
.create_or_alter_tables_on_demand(
&requests,
&ctx,
TableType::Physical,
statement_executor,
)
.await?;
let inserts = RowToRegion::new(table_name_to_ids, self.partition_manager.as_ref())
.convert(requests)
.await?;

self.do_request(inserts, &ctx).await
}

pub async fn handle_log_inserts(
&self,
mut requests: RowInsertRequests,
ctx: QueryContextRef,
statement_executor: &StatementExecutor,
) -> Result<Output> {
// remove empty requests
requests.inserts.retain(|req| {
req.rows
.as_ref()
.map(|r| !r.rows.is_empty())
.unwrap_or_default()
});
validate_column_count_match(&requests)?;

let table_name_to_ids = self
.create_or_alter_tables_on_demand(&requests, &ctx, TableType::Log, statement_executor)
.await?;
let inserts = RowToRegion::new(table_name_to_ids, self.partition_manager.as_ref())
.convert(requests)
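
`handle_log_inserts` first drops requests that carry no rows, using `retain` with `Option::map` and `unwrap_or_default`. The sketch below reproduces that filter on simplified, hypothetical stand-ins for the protobuf types (`Rows` and `RowInsertRequest` here are not the real `api::v1` structs).

```rust
/// Hypothetical, simplified stand-in for the row payload.
#[derive(Debug, Default)]
struct Rows {
    rows: Vec<Vec<String>>,
}

/// Hypothetical, simplified stand-in for a per-table insert request.
#[derive(Debug)]
struct RowInsertRequest {
    table_name: String,
    rows: Option<Rows>,
}

/// Keep only requests that have a payload with at least one row.
fn remove_empty_requests(inserts: &mut Vec<RowInsertRequest>) {
    inserts.retain(|req| {
        req.rows
            .as_ref()
            .map(|r| !r.rows.is_empty())
            // `None` (no payload at all) becomes `false`, so the request is dropped.
            .unwrap_or_default()
    });
}
```

Both a missing payload (`None`) and a present but empty payload are removed; only requests with at least one row survive.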
@@ -144,7 +180,7 @@ impl Inserter {
.create_or_alter_tables_on_demand(
&requests,
&ctx,
Some(physical_table.to_string()),
TableType::Logical(physical_table.to_string()),
statement_executor,
)
.await?;
@@ -366,7 +402,7 @@ impl Inserter {
&self,
requests: &RowInsertRequests,
ctx: &QueryContextRef,
on_physical_table: Option<String>,
table_type: TableType,
statement_executor: &StatementExecutor,
) -> Result<HashMap<String, TableId>> {
let mut table_name_to_ids = HashMap::with_capacity(requests.inserts.len());
@@ -394,42 +430,56 @@ impl Inserter {
}
}

if let Some(on_physical_table) = on_physical_table {
if !create_tables.is_empty() {
// Creates logical tables in batch.
let tables = self
.create_logical_tables(
create_tables,
ctx,
&on_physical_table,
statement_executor,
)
.await?;
match table_type {
TableType::Logical(on_physical_table) => {
if !create_tables.is_empty() {
// Creates logical tables in batch.
let tables = self
.create_logical_tables(
create_tables,
ctx,
&on_physical_table,
statement_executor,
)
.await?;

for table in tables {
for table in tables {
let table_info = table.table_info();
table_name_to_ids.insert(table_info.name.clone(), table_info.table_id());
}
}
if !alter_tables.is_empty() {
// Alter logical tables in batch.
statement_executor
.alter_logical_tables(alter_tables, ctx.clone())
.await?;
}
}
TableType::Physical => {
for req in create_tables {
let table = self.create_table(req, ctx, statement_executor).await?;
let table_info = table.table_info();
table_name_to_ids.insert(table_info.name.clone(), table_info.table_id());
}
for alter_expr in alter_tables.into_iter() {
statement_executor
.alter_table_inner(alter_expr, ctx.clone())
.await?;
}
}
if !alter_tables.is_empty() {
// Alter logical tables in batch.
statement_executor
.alter_logical_tables(alter_tables, ctx.clone())
.await?;
}
} else {
for req in create_tables {
let table = self.create_table(req, ctx, statement_executor).await?;
let table_info = table.table_info();
table_name_to_ids.insert(table_info.name.clone(), table_info.table_id());
}
for alter_expr in alter_tables.into_iter() {
statement_executor
.alter_table_inner(alter_expr, ctx.clone())
.await?;
TableType::Log => {
for req in create_tables {
let table = self.create_log_table(req, ctx, statement_executor).await?;
let table_info = table.table_info();
table_name_to_ids.insert(table_info.name.clone(), table_info.table_id());
}
for alter_expr in alter_tables.into_iter() {
statement_executor
.alter_table_inner(alter_expr, ctx.clone())
.await?;
}
}
}

Ok(table_name_to_ids)
}

@@ -571,6 +621,44 @@ impl Inserter {
}
}

async fn create_log_table(
&self,
req: &RowInsertRequest,
ctx: &QueryContextRef,
statement_executor: &StatementExecutor,
) -> Result<TableRef> {
let table_ref =
TableReference::full(ctx.current_catalog(), ctx.current_schema(), &req.table_name);

let request_schema = req.rows.as_ref().unwrap().schema.as_slice();
let create_table_expr = &mut build_create_table_expr(&table_ref, request_schema)?;

info!("Table `{table_ref}` does not exist, try creating table");
create_table_expr
.table_options
.insert("append_mode".to_string(), "true".to_string());
let res = statement_executor
.create_table_inner(create_table_expr, None, ctx.clone())
.await;

match res {
Ok(table) => {
info!(
"Successfully created table {}.{}.{}",
table_ref.catalog, table_ref.schema, table_ref.table,
);
Ok(table)
}
Err(err) => {
error!(
"Failed to create table {}.{}.{}: {}",
table_ref.catalog, table_ref.schema, table_ref.table, err
);
Err(err)
}
}
}

async fn create_logical_tables(
&self,
create_tables: Vec<&RowInsertRequest>,
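
In the diff, `create_log_table` differs from the regular creation path in that it sets the `append_mode` table option to `"true"` before creating the table. The sketch below isolates that step. `CreateTableExpr` here is a hypothetical, trimmed-down struct (the real one is a generated protobuf type with many more fields), and `build_log_table_expr` is an illustrative helper, not a function from this PR.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down stand-in for the create-table expression.
#[derive(Debug, Default)]
struct CreateTableExpr {
    table_name: String,
    table_options: HashMap<String, String>,
}

/// Build an expression for a log table and mark it as append-only.
fn build_log_table_expr(table_name: &str) -> CreateTableExpr {
    let mut expr = CreateTableExpr {
        table_name: table_name.to_string(),
        ..Default::default()
    };
    // Same key and value the diff inserts before calling `create_table_inner`.
    expr.table_options
        .insert("append_mode".to_string(), "true".to_string());
    expr
}
```

Keeping the option in a plain string map means the log path can reuse the same expression builder as other tables and only adjust the options afterwards, which is how the diff structures it.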
3 changes: 2 additions & 1 deletion src/servers/Cargo.toml
@@ -104,7 +104,8 @@ tower = { workspace = true, features = ["full"] }
tower-http = { version = "0.4", features = ["full"] }
urlencoding = "2.1"
zstd.workspace = true

#pipeline = { git = "ssh://git@github.com/GreptimeTeam/pipeline.git", rev = "6b88c3c627da9e20f8fd160071e9c69b3ebd4e6a" }
pipeline = { path = "../../../pipeline" }
[target.'cfg(not(windows))'.dependencies]
tikv-jemalloc-ctl = { version = "0.5", features = ["use_std"] }
