[RFC] Integrating SQL/PPL query languages into DSL via the _search API #12434

Closed
vmmusings opened this issue Feb 22, 2024 · 24 comments
Labels
enhancement Enhancement or improvement to existing feature or request RFC Issues requesting major changes Search:Query Capabilities

Comments

@vmmusings
Member

vmmusings commented Feb 22, 2024

SQL/PPL via DSL in Search API.

1. Problem Statement.

Today, OpenSearch supports the SQL and PPL query languages through the plugin endpoints _plugins/sql and _plugins/ppl. However, clients using OpenSearch client libraries face limitations, as these libraries do not accommodate plugin endpoints. To increase adoption with minimal disruption, our proposal introduces new SQL and PPL clauses directly into the SearchRequest body. This approach aims to facilitate the use of these languages through the Search API, streamlining access and integration for users.

2. Summary

  1. Add response selector to _search API (e.g. "result_format":"hit_object" vs "result_format":"datarow" ) to support existing format or datarows. By default we still use hit_object.
  2. Add SQL query support to _search API. If you send a SQL query and don’t explicitly specify result_format, the format defaults to datarow.
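The defaulting rule in the two points above can be sketched in a few lines of Python. This is only an illustration of the stated behavior: the function name `resolve_result_format` is hypothetical and is not part of OpenSearch.

```python
from typing import Any, Dict

# Format names taken from the summary above; everything else is hypothetical.
HIT_OBJECT = "hit_object"
DATAROW = "datarow"


def resolve_result_format(body: Dict[str, Any]) -> str:
    """Pick the response format for a _search request body."""
    explicit = body.get("result_format")
    if explicit is not None:
        if explicit not in (HIT_OBJECT, DATAROW):
            raise ValueError(f"unknown result_format: {explicit!r}")
        return explicit
    # No explicit choice: SQL requests default to datarow, all others to hit_object.
    return DATAROW if "sql" in body else HIT_OBJECT
```

An explicit `result_format` always wins, so a SQL request can still ask for `hit_object` and a DSL request can ask for `datarow`.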

From the user's perspective, the following example demonstrates a SQL via DSL request and response.

### Request
POST {{baseUrl}}/_search
Content-Type: application/x-ndjson

{
  "sql": {
    "query": "select 1"
  }
}

### Response
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 0,
    "successful": 0,
    "skipped": 0,
    "failed": 0
  },
  "datarows": {
    "schema": [
      {
        "name": "1",
        "type": "integer"
      }
    ],
    "datarows": [
      [
        1
      ]
    ],
    "total": 1,
    "size": 1
  }
}

In this doc, we will discuss detailed design, limitations, and development plan.

3. Tenets:

  • Minimal disruption to the existing use cases and SQL Plugin APIs.
    • This helps in continuing existing support to Observability Plugins, JDBC, ODBC drivers.
  • Minimal duplication of code and maintenance across different use cases.
  • Query execution should uphold the security policies defined in the execution context.
  • The new functionality should be supported for both transport and REST high-level clients. Any changes made should be under the transport layer, not in the REST layer.

4. Solution

4.1.Search API

4.1.1. Endpoint

| Category | Method | Path | SQL Support | Description |
|---|---|---|---|---|
| Search | GET | /target-index/_search | No Support | The SQL FROM clause specifies the index. |
| Search | GET | /_search | Support | |
| Search | POST | /target-index/_search | No Support | The SQL FROM clause specifies the index. |
| Search | POST | /_search | Support | |
| Scroll | ALL | ALL | No Support | SQL uses LIMIT and OFFSET to retrieve a portion of the rows. The syntax is not aligned with the scroll API. |
| Multi-Search | GET | _msearch | Support | |
| Multi-Search | GET | /target-indices/_msearch | No Support | The SQL FROM clause specifies the indices. |
| Multi-Search | POST | _msearch | Support | |
| Multi-Search | POST | /target-indices/_msearch | No Support | The SQL FROM clause specifies the indices. |

These APIs, along with asynchronous search, are where DSL is mostly used. A validation exception is thrown whenever a sql block is encountered in one of the unsupported APIs above.
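As a rough sketch of that validation, assuming a request is reduced to its path and a parsed JSON body (the real _msearch body is NDJSON, which is ignored here), the check could look like the following. `ValidationError` and `validate_sql_endpoint` are hypothetical names, not OpenSearch classes.

```python
import re
from typing import Any, Dict

# Paths from the endpoint table that accept a "sql" block (no index in the URL).
_SQL_CAPABLE_PATHS = re.compile(r"^/?(_search|_msearch)$")


class ValidationError(ValueError):
    """Stand-in for the validation exception mentioned above (hypothetical)."""


def validate_sql_endpoint(path: str, body: Dict[str, Any]) -> None:
    """Reject a "sql" block on endpoints the table marks as unsupported."""
    if "sql" not in body:
        return  # plain DSL request: nothing to check here
    if not _SQL_CAPABLE_PATHS.match(path):
        raise ValidationError(f"a 'sql' block is not supported on {path}")
```

Index-scoped paths such as `/accounts/_search` and scroll paths such as `/_search/scroll` fail the pattern and are rejected, while DSL requests pass through untouched.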

4.1.2.URL Parameters

None of the URL parameters are supported.

4.1.3.Request Body

Sample Request Body:

localhost:9200/_search
{
   "ppl" : {
        "query" : "source = accounts"
    }
}

OR

{
   "sql" : {
        "query" : "select * from accounts"
    }
}

| Field | Type | Description | SQL |
|---|---|---|---|
| aggs | Object | In the optional aggs parameter, you can define any number of aggregations. Each aggregation is defined by its name and one of the types of aggregations that OpenSearch supports. For more information, see Aggregations. | No Support |
| docvalue_fields | Array of objects | The fields that OpenSearch should return using their docvalue forms. Specify a format to return results in a certain format, such as date and time. | No Support |
| fields | Array | The fields to search for in the request. Specify a format to return results in a certain format, such as date and time. | No Support |
| explain | String | Whether to return details about how OpenSearch computed the document’s score. Default is false. | No Support |
| from | Integer | The starting index to search from. Default is 0. | No Support |
| indices_boost | Array of objects | Values used to boost the score of specified indexes. Specify in the format of : | No Support |
| min_score | Integer | Specify a score threshold to return only documents above the threshold. | No Support |
| query | Object | The DSL query to use in the request. | No Support |
| seq_no_primary_term | Boolean | Whether to return sequence number and primary term of the last operation of each document hit. | No Support |
| size | Integer | How many results to return. Default is 10. | No Support |
| _source | | Whether to include the _source field in the response. | No Support |
| stats | String | Value to associate with the request for additional logging. | No Support |
| terminate_after | Integer | The maximum number of documents OpenSearch should process before terminating the request. Default is 0. | No Support |
| timeout | Time | How long to wait for a response. Default is no timeout. | Support |
| version | Boolean | Whether to include the document version in the response. | No Support |
| sql | Object | | New Field |
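One way to read the table is as an allow-list: when a sql block is present, only timeout may accompany it. The sketch below assumes that reading and additionally allows result_format from the Summary section, which the table does not list. The function name is hypothetical.

```python
from typing import Any, Dict, List

# "timeout" is the only existing field the table marks as supported with SQL.
# "result_format" comes from the Summary section and is an assumption here.
_ALLOWED_WITH_SQL = {"sql", "timeout", "result_format"}


def unsupported_fields_with_sql(body: Dict[str, Any]) -> List[str]:
    """Return the request-body fields that cannot accompany a "sql" block."""
    if "sql" not in body:
        return []  # DSL-only request: every field keeps its usual meaning
    return sorted(field for field in body if field not in _ALLOWED_WITH_SQL)
```

A caller could raise a validation error whenever the returned list is non-empty, naming the offending fields in the message.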

4.1.4.Response

If the query type is SQL, the response format is datarows: the query response includes a datarows section, which contains schema and datarows. For example:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 0,
    "successful": 0,
    "skipped": 0,
    "failed": 0
  },
  "datarows_output": {
    "schema": [
      {
        "name": "firstname",
        "type": "text"
      },
      {
        "name": "lastname",
        "type": "text"
      },
      {
        "name": "age",
        "type": "long"
      }
    ],
    "datarows": [
      [
        "Nanette",
        "Bates",
        28
      ],
      [
        "Amber",
        "Duke",
        32
      ]
    ]
  }
}
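On the client side, a schema/datarows section like the one above is easy to turn back into per-row records. A minimal Python sketch, with `datarows_to_records` as a hypothetical helper name:

```python
from typing import Any, Dict, List


def datarows_to_records(section: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each row's values with the column names from the schema."""
    names = [column["name"] for column in section["schema"]]
    records = []
    for row in section["datarows"]:
        if len(row) != len(names):
            raise ValueError("row length does not match schema length")
        records.append(dict(zip(names, row)))
    return records


section = {
    "schema": [
        {"name": "firstname", "type": "text"},
        {"name": "lastname", "type": "text"},
        {"name": "age", "type": "long"},
    ],
    "datarows": [["Nanette", "Bates", 28], ["Amber", "Duke", 32]],
}
print(datarows_to_records(section)[0])
# {'firstname': 'Nanette', 'lastname': 'Bates', 'age': 28}
```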

4.2.Feature Parity - SQL vs DSL

Currently, SQL does not fully support all DSL queries and aggregations. The following table highlights the key query features that are missing support in SQL:

| Category | SQL |
|---|---|
| Compound | Support |
| Full text queries | Support |
| Geo queries | No Support |
| Shape queries | No Support |
| Joining queries | No Support |
| Span queries | No Support |
| Specialized queries | No Support |
| Term-level queries | Support |

The following metrics aggregation functions are also missing in SQL:

| Category | SQL |
|---|---|
| Geo-bounds | No Support |
| Geo-centroid | No Support |
| Percentiles | No Support |
| Rate | No Support |
| T-test | No Support |

4.3.Performance

SQL queries through the search endpoint should offer performance comparable to that of DSL queries. Users should not experience any degradation in performance. We use the OpenSearch Benchmark framework to compare DSL queries with SQL via Search.

4.4.Client

All OpenSearch clients should support SQL queries and the datarow response. We have three different types of clients.

4.5.Security

Calls to SQL via _search include index names in the request body, so they have the same access policy considerations as the bulk, mget, and msearch operations.

5. Detailed Design.

5.1 Approach 1: Extend SearchPlugin Interface and Integrate with SQL plugin.

In this approach, we will add a new function to the existing SearchPlugin interface to introduce a new construct called QueryEngineSpec. Plugins can implement the SearchPlugin interface to introduce a query engine that takes over the search request to produce the SearchResponse. QueryEngineSpec defines the name of the spec, which will be the key in the DSL under which the respective query engine request parameters are enclosed. Once OpenSearch-core receives a request with a clause containing a key defined by a QueryEngineSpec, OpenSearch-core creates the Query Engine and transfers the request to the plugin via the Query Engine Object.

[Image: Screenshot 2024-02-16 at 12 47 17 AM]

We would introduce a new field, data_rows_output, in InternalSearchResponse. The SQL and PPL plugins would populate this field along with the took, shards, and timed_out metadata. For a normal DSL query, the hits object would be populated instead.
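The dispatch described in Approach 1 can be illustrated with a small Python model. This is not the proposed Java interface: `QueryEngineSpec` here is a Python stand-in, and `SearchDispatcher` and `fake_sql_engine` are hypothetical names used only to show how a key in the request body selects the engine that takes over the request.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# A query engine takes its own request parameters and returns a response dict.
QueryEngine = Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass(frozen=True)
class QueryEngineSpec:
    """Python stand-in for the proposed spec: a DSL key plus an engine factory."""
    name: str
    engine_factory: Callable[[], QueryEngine]


class SearchDispatcher:
    """Plays the role of OpenSearch core: routes a request body to an engine."""

    def __init__(self, specs: List[QueryEngineSpec]) -> None:
        self._specs = {spec.name: spec for spec in specs}

    def search(self, body: Dict[str, Any]) -> Dict[str, Any]:
        for key, params in body.items():
            spec = self._specs.get(key)
            if spec is not None:
                engine = spec.engine_factory()
                return engine(params)  # the plugin takes over the request
        return {"hits": {"hits": []}}  # placeholder for the normal DSL path


def fake_sql_engine() -> QueryEngine:
    # Toy engine: echoes the query instead of executing it.
    def run(params: Dict[str, Any]) -> Dict[str, Any]:
        rows = {"schema": [], "datarows": [], "query": params["query"]}
        return {"took": 0, "timed_out": False, "data_rows_output": rows}
    return run
```

Registering `QueryEngineSpec("sql", fake_sql_engine)` makes any body with a top-level `sql` key go to the toy engine, while bodies without a registered key fall through to the DSL placeholder.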

5.2 Approach 2: Leverage Search Pipelines and introduce new SQL/PPL processors.

In the current state, search pipelines offer the following types of processors:

  • Request Processors → Transform SearchRequest.
  • Response Processors → Transform SearchResponse.
  • Search Phase Results Processors → Transform SearchResults between the query and fetch phases.

The Flow framework proposes to include a new type of search processor in Pipelines. This search processor would take a SearchRequest and produce a SearchResponse. We can leverage this new Search Processor type to introduce SQLSearchProcessor and PPLSearchProcessor, which take over the Search Request whenever there is an SQL and PPL block in the search request, respectively. Since an SQL request is a blocking operation, we would build something like processResponseAsync, which takes in a SearchResponse Listener. For including SQL/PPL-related request and response bodies, we could leverage the already existing ext clauses feature in SearchResponse and SearchRequest.

[Image: Screenshot 2024-02-13 at 10 12 52 AM]

Request Flow

  • The user sends a _search request with an sql/ppl block inside the ext block.
  • SearchSourceBuilder parses this block and puts the extBuilders in the SearchRequest.
  • [Need further deep dive] SearchPipelineService identifies these extBuilders and includes SQLSearchProcessors and PPLSearchProcessors respectively in the pipeline. Should we include the SQL and PPL search processors in the default pipeline?
  • The SQL or PPL SearchProcessor responds with the SearchResponse through the listener passed from core.
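The request flow above can be modeled roughly as follows. This Python sketch only mirrors the shape of the flow (an ext block, a processor that takes over, a listener callback); the class and function names are hypothetical and the processor does not execute real SQL.

```python
from typing import Any, Callable, Dict

Listener = Callable[[Dict[str, Any]], None]


class SQLSearchProcessor:
    """Toy stand-in for the proposed processor; it does not run real SQL."""

    def applies_to(self, request: Dict[str, Any]) -> bool:
        return "sql" in request.get("ext", {})

    def process_async(self, request: Dict[str, Any], listener: Listener) -> None:
        query = request["ext"]["sql"]["query"]
        # A real processor would hand the query to the SQL engine and call the
        # listener when it finishes; here the response is built immediately.
        result = {"sql": {"query": query, "datarows": []}}
        listener({"took": 0, "timed_out": False, "ext": result})


def run_pipeline(request: Dict[str, Any], listener: Listener) -> None:
    """Let the first matching processor take over, else fall back to DSL."""
    for processor in (SQLSearchProcessor(),):
        if processor.applies_to(request):
            processor.process_async(request, listener)
            return
    listener({"hits": {"hits": []}})  # placeholder for the regular search path
```

Passing the response through a listener rather than a return value is what lets a blocking SQL execution complete on another thread without holding up the caller.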

5. Task breakdown

| Stage | Task | Effort | Owner | Status |
|---|---|---|---|---|
| P0 | Design Alignment | 2W | Vamsi Manohar | |
| P0 | OpenSearch Core QueryEngineSpec Changes | 2W | | |
| P0 | OpenSearch Core Validation Changes | 1W | | |
| P0 | OpenSearch Core UTs/ITs | 1W | | |
| P0 | SQL Changes for supporting PPL, UTs, ITs | 2W | | |
| P0 | SQL Changes for supporting SQL, UTs, ITs | 2W | | |
| P0 | SQL Changes for supporting PPL/SQL Explain | 1W | | |
| P0 | Performance benchmark | 2W | | |
| P0 | Threat Modelling and Pen testing | 2W | | |
| P0 | Java Rest client | 2W | | |
| P0 | Java client | 2W | | |
| P0 | JavaScript client | 4W | | Requires changes only in opensearch-api-specification. Query Params might require separate handling. |
| P1 | Python client | | | |
| P1 | Go client | | | |
| P1 | Ruby client | | | |
| P1 | PHP client | | | |
| P1 | .NET client | | | |
| P1 | Rust client | | | |
| P1 | SQL Feature Parity - Query | 8W | | |
| P1 | SQL Feature Parity - Aggregation | 8W | | |

6. Open Questions and Edge Case Scenarios.

7. POC:

@vmmusings vmmusings added enhancement Enhancement or improvement to existing feature or request untriaged labels Feb 22, 2024
@anirudha

Approach 1 (recommended) seems independent for query languages, which are a low-level building block that can be used in option 2 anyway.

@vmmusings
Member Author

vmmusings commented Feb 22, 2024

POC Video:
sql_in_dsl-ezgif com-optimize

Draft PRs:

@wbeckler

Why not modify the client so that it accesses the plugins/sql endpoint?

@model-collapse

What is the use case for this approach?

@navneet1v
Contributor

navneet1v commented Feb 27, 2024

@vamsi-amazon have you thought about introducing ppl/sql as another query clause rather than coming up with a new Query Engine concept?

Currently in OpenSearch you can define a query type along with how to parse and convert the query into the appropriate Lucene query clause. I am wondering whether we have explored that option, and what the reason is for not choosing it rather than building a new concept altogether.

This would have many advantages:

  1. Users can combine this new ppl/sql query clause with any other complex query.
  2. You get out-of-the-box support for various search features, such as concurrent segment search.
  3. Aggregations and other features can be directly supported with this.

@peternied peternied changed the title [Draft] Integrating SQL/PPL query languages into DSL via the _search API [RFC] Integrating SQL/PPL query languages into DSL via the _search API Feb 28, 2024
@peternied peternied added the RFC Issues requesting major changes label Feb 28, 2024
@peternied
Member

[Triage - attendees 1 2 3 4 5]
@vamsi-amazon Thanks for filing this RFC looking forward to seeing where this topic lands.

@peternied
Member

> However, clients using OpenSearch client libraries face limitations, as these libraries do not accommodate plugin endpoints.

Where is this problem explored? If client library support was improved it would benefit the SQL plugin and all other plugins for OpenSearch.

@penghuo
Contributor

penghuo commented Mar 14, 2024

How does approach 1 work with the OpenSearch client library? Do we plan to upgrade the OpenSearch client library to support the new SQL/PPL query type?

@anirudha

anirudha commented Mar 15, 2024

> Why not modify the client so that it accesses the plugins/sql endpoint?

We already support drivers for JDBC/ODBC and DBAPI. Other OpenSearch clients would need to support the JDBC/ODBC spec and enable access via SQL/PPL; this can be done in the fullness of time. Our goal is not just client user access but also developer access without introducing inter-plugin dependencies. Many of our users still use the DSL and handcraft DSL queries.

The proposal is not to move code but maintain code modularity by adding a QueryEngineSpec.

> @vamsi-amazon have you thought about introducing ppl/sql as another query clause rather than coming up with a new Query Engine concept?
>
> Currently in OpenSearch you can define a query type along with how to parse and convert the query into the appropriate Lucene query clause. I am wondering whether we have explored that option, and what the reason is for not choosing it rather than building a new concept altogether.
>
> This would have many advantages:
>
>   1. Users can combine this new ppl/sql query clause with any other complex query.
>   2. You get out-of-the-box support for various search features, such as concurrent segment search.
>   3. Aggregations and other features can be directly supported with this.

SQL is an independent high-level query language, hence 1 doesn't apply; 2 and 3 can still be used.

> How does approach 1 work with the OpenSearch client library? Do we plan to upgrade the OpenSearch client library to support the new SQL/PPL query type?

No, it doesn't need to work out of the box. The SQL drivers (JDBC/ODBC/DBAPI) will continue to work for developers and users.

> What is the use case for this approach?

Integrating SQL/PPL into OpenSearch as standard languages enhances its utility and accessibility. For users, it promises compatibility with JDBC/ODBC and DBAPI clients, opening up OpenSearch to a wider audience. All features, including dashboards, will eventually support SQL/PPL by default, increasing usability. For developers, incorporating these features into the core simplifies development and avoids plugin dependencies while ensuring backward compatibility, making OpenSearch a more unified platform for querying. This move positions OpenSearch as a leading relevancy-focused SQL engine with advanced capabilities like highlighting and full-text queries.

PPL reference manual
https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/index.rst

SQL reference manual
https://github.com/opensearch-project/sql/blob/main/docs/user/index.rst

developer docs
https://github.com/opensearch-project/sql/blob/main/docs/dev/index.md

Getting started developer guide
https://github.com/opensearch-project/sql/blob/main/DEVELOPER_GUIDE.rst

Drivers
https://opensearch.org/downloads.html

This approach:

  • Streamlines access to SQL and PPL through the standard Search API, enhancing usability.
  • Encourages broader adoption by making SQL and PPL features more accessible to users unfamiliar with plugin-specific endpoints.
  • Supports extensibility through the QueryEngineSpec, allowing for custom query engine implementations.
  • Improves system architecture by leveraging existing interfaces and patterns, promoting a more unified and coherent platform design.
  • Enhances the flexibility of OpenSearch by accommodating various query languages within a unified framework.
  • Addresses current limitations and gaps in functionality with respect to SQL and PPL usage in OpenSearch plugins/ecosystem.
  • Aims to maintain backward compatibility and minimize disruption to existing workflows and applications.
  • Enables OpenSearch plugins such as Alerting to support SQL- and PPL-based alerts.

OpenSearch clients don't need to support SQL/PPL by default; they are supported by the JDBC/ODBC spec'ed drivers and DBAPI. Since this is an optional clause, clients can ignore it. Search pipelines are not a low-level feature suited to implementing a fundamental query language.

@msfroh
Collaborator

msfroh commented Mar 15, 2024

> Streamlines access to SQL and PPL through the standard Search API, enhancing usability.

This is the part that bugs me. It's not using the standard Search API.

We want to access SQL/PPL with JDBC/ODBC clients. Sure. The requests are not _search API requests (i.e. SearchRequest). The responses are not _search API responses (i.e. SearchResponse).

Given that we have no interest in SearchRequest and SearchResponse, what does this have to do with the _search API?

For example, could I add a QueryEngineSpec called math, where I send a _search request, like:

localhost:9200/_search
{
   "math" : {
        "query" : "5 * 10 + 3"
    }
}

Then I get back a response like:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 0,
    "successful": 0,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "dummy"
  },
  "math": {
    "answer": 53
  }
}

Is that something we want to support? What things go into the _search API versus their own APIs? Does it make sense to read cluster settings from _search APIs?

There's nothing stopping me from adding a /_math endpoint via a plugin that can support its own API directly:

// Request:
localhost:9200/_math
{
  "expression" : "5 * 10 + 3"
}

// Response:
{
  "answer": 53
}

@dblock
Member

dblock commented Mar 15, 2024

> Is that something we want to support? What things go into the _search API versus their own APIs?

This is the right question, for which I think we need some tenets. Search is over documents that are stored in indexes.

To me, search is defined as something that 1) parses a query written in some language, 2) evaluates every stored document against that query, 3) matches or doesn't match the document, 4) produces a score for all documents that match, then 5) sorts results by score and 6) returns them.

Is this an acceptable definition @msfroh?

If so, in the case of the math example or settings you're missing 2), 3), 4) and 5), so it doesn't fit under search. In the case of SQL I think it fits that definition where the language to express the query is different.
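To make the six steps concrete, here is a toy Python version of that definition. It is purely illustrative: the query "language" is a single word, the score is a term count, and none of the names correspond to OpenSearch internals.

```python
from typing import Callable, Dict, List, Tuple

Document = Dict[str, str]
# A parsed query scores a document; a score of 0 means "no match".
Scorer = Callable[[Document], float]


def parse(query: str) -> Scorer:
    """Step 1: parse a one-word query into a term-count scorer (toy language)."""
    term = query.strip().lower()

    def score(doc: Document) -> float:
        return float(doc.get("text", "").lower().split().count(term))

    return score


def search(query: str, docs: List[Document]) -> List[Tuple[float, Document]]:
    scorer = parse(query)                                  # 1) parse
    scored = [(scorer(doc), doc) for doc in docs]          # 2) evaluate every document
    matches = [(s, d) for s, d in scored if s > 0]         # 3) match, 4) keep the score
    matches.sort(key=lambda pair: pair[0], reverse=True)   # 5) sort by score
    return matches                                         # 6) return
```

Under this framing a different query language only changes `parse`; the remaining steps stay the same, which is the sense in which SQL fits the definition and the math example does not.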

@dblock
Member

dblock commented Mar 15, 2024

> We want to access SQL/PPL with JDBC/ODBC clients. Sure. The requests are not _search API requests (i.e. SearchRequest). The responses are not _search API responses (i.e. SearchResponse).

This is confusing to me. I suppose I don't understand internals. I think there should be a SearchRequest and SearchResponse independent of the transport API, aka we need RestSearchRequest < SearchRequest, ODBCSearchRequest < SearchRequest, etc.

@msfroh
Collaborator

msfroh commented Mar 15, 2024

> This is confusing to me. I suppose I don't understand internals. I think there should be a SearchRequest and SearchResponse independent of the transport API, aka we need RestSearchRequest < SearchRequest, ODBCSearchRequest < SearchRequest, etc.

Aha! I like this.

I think this gets into some of the question of "What is the input used to perform an internal operation versus what is the representation sent over the wire?" that touches on the challenge that @VachaShah has encountered on her Protobuf work, made difficult by the fact that business objects have historically defined their own wire format.

I believe the approach in this proposal is "How can we embed a SQL/PPL representation of a search request inside the existing REST _search API?" Maybe instead, it should be "How can the _search API accommodate different representations of a search request?"

This almost feels like we want to support a different Content-Type for the API (albeit with a more significant interpretation of Content-Type versus the existing XContent framework from which the "business objects == serialized objects" evil arises.) Of course, dispatching a request to a /_search endpoint and forking the logic by Content-Type isn't fundamentally different from just hitting a different endpoint.

Of course, once we're on the cluster, we're forking down completely different paths. The existing SearchRequest class is married to query DSL and is remarkably low-level in its specificity. I gather that the SQL/PPL logic goes and does very different "stuff" that may eventually trigger DSL queries of its own. Ultimately, I don't think we could reasonably say that a SQL/PPL request "extends" a SearchRequest without moving, well, everything into the separate RestSearchRequest -- once you've moved the DSL-specific stuff out, there's not much left.

@andrross
Member

> In the case of SQL I think it fits that definition where the language to express the query is different.

@dblock I think by your definition the took, _shards, hits, etc. fields in the response would be relevant for any search request, but this proposal explicitly calls them "unrelated parameters". If we truly need to exclude those fields then it feels like we're shoehorning something into the API similar to @msfroh's contrived math example.

@msfroh
Collaborator

msfroh commented Mar 15, 2024

I think I see a way that we can handle this, albeit in two steps:

  1. On the response side, we add support for the schema/datarows concept as an OpenSearch core feature, where anyone can request it (even if they're doing a DSL query). This is arguably a better response format for a lot of use-cases and it's a good feature. (While Lucene and therefore OpenSearch supports a flexible schema where every doc can have its own set of fields, in practice you tend to return a bunch of docs with the same fields -- otherwise it's really hard to use.)
  2. On the request side, we use something like this QueryEngine proposal to process a SearchRequest (including different syntax) and get back a SearchResponse. In this case, from an architecture standpoint, I would perhaps suggest (as in @dblock's message above) we consider it as "different" from a regular REST search request, and we fork off at the REST layer.

That way, we preserve the "SearchResponse returns documents" part that my contrived math example doesn't (though it could send an answer back in a document 😄 ). From a code standpoint, we could split off before trying to parse into a SearchSourceBuilder.
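The response-side idea in step 1, returning schema/datarows even for a DSL query, could be sketched like this in Python. The helper name `hits_to_datarows` is hypothetical, and column types are left out because a real implementation would presumably read them from the index mapping.

```python
from typing import Any, Dict, List


def hits_to_datarows(hits: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten _source documents into a schema plus rows; missing fields become None."""
    names: List[str] = []
    for hit in hits:
        for field in hit["_source"]:
            if field not in names:
                names.append(field)  # keep first-seen column order
    rows = [[hit["_source"].get(name) for name in names] for hit in hits]
    # Types are omitted: a real implementation would take them from the mapping.
    return {"schema": [{"name": name} for name in names], "datarows": rows}
```

Documents that lack a field simply get `None` in that column, which is how the flexible per-document schema collapses into one tabular shape.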

@anirudha

> > In the case of SQL I think it fits that definition where the language to express the query is different.
>
> @dblock I think by your definition the took, _shards, hits, etc. fields in the response would be relevant for any search request, but this proposal explicitly calls them "unrelated parameters". If we truly need to exclude those fields then it feels like we're shoehorning something into the API similar to @msfroh's contrived math example.

While I agree with @dblock and @msfroh, let's try to standardize. But if we look at the DSL structure today, there is no spec or structure.

An OpenSearch response today is divided into roughly 2 parts:

  1. response metadata (shards, hits, etc.)
  2. response data

We should be able to fill all the response metadata fields, but I propose that the response data format follow the JDBC spec, which is simple to use and understand no matter what aggregation is used.


For example:
{
  "took": 23.1,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "324"
  },
  "ppl": {
    "schema": [...],
    "datarows": [....]
  }
}

@dblock regarding how you are defining tenets: I agree, we should have a search-first experience, not math, for example.

SQL/PPL supports almost all OpenSearch relevancy features in an easy-to-use, high-level language:
https://github.com/opensearch-project/sql/blob/main/docs/user/beyond/fulltext.rst

This would be the only, and most powerful, SQL dialect that supports all relevancy features in a SQL/piped language on a search engine.

I agree with @msfroh on the final comment here:
#12434 (comment)

@dblock
Member

dblock commented Mar 19, 2024

@vamsi-amazon Thoughts on updating the proposal above with the information discussed?

@penghuo
Contributor

penghuo commented Mar 20, 2024

I agree with @msfroh's final comment on supporting the schema/datarows concept.

We should also align on the scope and launch criteria for supporting SQL in the _search endpoint:

  • SQL does not support all DSL queries and aggregations; for instance, SQL does not support geo and shape queries. Is there any concern about mixing SQL and native DSL queries in a single endpoint?
  • Searching on an index (index00001/_search) does not align with the SQL FROM syntax; the user can select any table in a SQL statement.
  • Scroll / PIT search does not align with SQL pagination syntax.
  • Not all search URI parameters can be supported, for instance suggest_field.
  • Not all search body fields can be used along with SQL, for instance docvalue_fields.

We should also:

  • Run a performance benchmark; SQL latency and resource usage should be similar to DSL.
  • Align that new features introduced in DSL should have a corresponding implementation in SQL.
  • Have SQL support existing DSL features. This is not a P0 feature; we can keep improving it.

@vmmusings
Member Author

@dblock Got busy with 2.13 release for last couple of days. Will update the proposal with the information discussed.

@navneet1v
Contributor

> Maybe instead, it should be "How can the _search API accommodate different representations of a search request?"

This is a truly awesome question and problem statement that we should take a deeper look at if we want to support different query engines.

But here are some thoughts I have after reading the conversation:

  1. If we have different representations of what a SearchRequest can be, basically translating the RestSearchRequest into a bunch of engine-specific search requests, aren't we just using OpenSearch's _search as a proxy layer to call different engines (say, SQL hosted on a remote endpoint)?
  2. If RestSearchRequest can handle more than one type of engine, BulkRequest or IndexRequest should also support methods to index/put data into the underlying storage of the query engine.
  3. If we do support, say, both index and search requests to different engines, what we have built is OpenSearch as a thin distributed system. To avoid getting into this, we need to tie the OpenSearch index in somehow with the index and search requests. Otherwise, what is the point of having data nodes, shards, etc.? All you need is a machine that converts one type of payload into an engine-specific payload, aka a query.

@dblock
Member

dblock commented Mar 22, 2024

@penghuo you have good examples of how SQL doesn't align well with other queries

@navneet1v Does your question boil down to the fact that search needs to be aware of what kind of index it is?

I think both comments are just generic questions of what's constant and what's variable (OO abstractions). For example scroll / pit search does not align with sql pagination syntax - "scrolling" is a common feature over data, aka there should be a set of interfaces that together implement scrolling, but inputs may differ between SQL (e.g. offset and limit) and DSL (e.g. cursor and size), and engines can have different implementations and effectively perform scrolling functions differently, all while data fetching or distributing requests to shards are common.
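The "constant versus variable" point about scrolling can be sketched as one shared interface with two implementations. This Python example is an assumption-laden illustration: `Pager`, `OffsetLimitPager`, `CursorPager`, and `drain` are hypothetical names, the data is an in-memory list of distinct integers, and the cursor variant simply remembers the last value it returned.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence


class Pager(ABC):
    """Common scrolling contract: return the next page, or an empty list when done."""

    @abstractmethod
    def next_page(self) -> List[int]:
        ...


class OffsetLimitPager(Pager):
    """SQL-style paging driven by OFFSET and LIMIT."""

    def __init__(self, rows: Sequence[int], limit: int) -> None:
        self._rows, self._limit, self._offset = rows, limit, 0

    def next_page(self) -> List[int]:
        page = list(self._rows[self._offset:self._offset + self._limit])
        self._offset += len(page)
        return page


class CursorPager(Pager):
    """DSL-style paging driven by a cursor (last value seen) and a size."""

    def __init__(self, rows: Sequence[int], size: int) -> None:
        self._rows, self._size = sorted(rows), size
        self._cursor: Optional[int] = None

    def next_page(self) -> List[int]:
        remaining = [r for r in self._rows if self._cursor is None or r > self._cursor]
        page = remaining[:self._size]
        if page:
            self._cursor = page[-1]
        return page


def drain(pager: Pager) -> List[int]:
    """Shared code that works with either implementation."""
    out: List[int] = []
    while page := pager.next_page():
        out.extend(page)
    return out
```

`drain` never needs to know which paging style is in use, which is the kind of common layer the comment argues for; only the inputs and the per-engine implementation differ.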

@anirudha

Summarizing the recent discussions above and evaluation regarding the integration of SQL into the OpenSearch core.

We explored various strategies, listed in the updated description, for incorporating SQL into the OpenSearch core, ultimately recommending against integration in favor of enhancing client capabilities.

The initial approaches considered involve directly integrating SQL into the _search endpoint or as a new endpoint within the core, each presenting distinct advantages. However, these strategies also face significant drawbacks, including limited parity in DSL query support, non-uniform response structures, increased complexity in core repo, potential build system integration challenges and significantly increased cost in upfront development with no extra customer value over the current approach.

The preferred approach advocates for maintaining the current setup with added transport clients for SQL, presenting a minimal change strategy that ensures compatibility with other plugins.

(Preferred): No Core Integration; Enhance Client Capabilities and Add Cohesion in core dashboards features.

The preferred approach aims to introduce a transport API for SQL and a new transport client library. This method ensures SQL compatibility with other plugins while maintaining the current system's integrity and minimizing changes.


The most compelling reason I would see for merging the SQL code into core is if we think that, long term, the SQL query engine might want to integrate directly into the low-level Lucene query engine and there might be benefits to having the query DSL implementation living side-by-side with the SQL implementation. We'd probably want to tease the entire current query DSL engine out of the server module into its own thing, which could then also contain the SQL implementation. However, I don't really see this happening or a need for this anytime soon. SQL depends on and uses the DSL and custom scripting via DSL; it inherits all DSL advances that the core project will make.

If the plan is to keep the SQL implementation as either a plugin or module component in the core repository, then moving the code from one repository to another seems more like an administrative decision than a software architecture one. (Paraphrasing @andrross.)

@kavilla
Member

kavilla commented Jun 25, 2024

@dblock opensearch-project/OpenSearch-Dashboards#7081 if you have some time to check this one out too?

@dblock
Member

dblock commented Jun 27, 2024

@kavilla it looks like this proposal was closed?
