Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPIKE] Create @loopback/boot to support declarative JSON/YAML files #441

Closed
raymondfeng opened this issue Jul 14, 2017 · 17 comments
Closed
Assignees

Comments

@raymondfeng
Copy link
Contributor

raymondfeng commented Jul 14, 2017

Staging goals:

  1. Be able to load LB 3.x artifacts including models/datasources
  2. Support LB next artifacts, such as model/datasource/repository/api-spec/...

Story

As a LoopBack developer, I'd like to declare my artifacts as JSON or YAML files so that I can continue enjoying the declarative experience.

Acceptance Criteria

  • On application boot:
    • Read each of the directories in src for their corresponding artifacts (models, datasources, controllers, etc)
    • For each artifact:
      • Parse the artifact as an object
      • Semantically validate that the object meets the requirements of the artifact type it specifies
      • Wire up the artifact to the application context using the appropriate artifact prefix

Example

Given a src directory:

src/
  datasources/
    foo.datasource.json

we'll want to parse the datasource artifact, which contains:

{
    "host": "127.0.0.1",
    "port": 3001,
    // etc
}

Once we've loaded the file, we'll want to ensure that it has all of the properties required for it to be a datasource (in this case, a "host" property might be required), and then bind it to the context.

const datasource = validateDatasource(dsObj);
app.bind(`datasources.${name}`).to(datasource);

Rules

  • If more than one artifact exists with a different extension (foo.model.json and foo.model.yaml)
    • prefer JSON over YAML
    • if there is a TS file and JSON/YAML, merge the JSON/YAML definition into the parsed TS class
      foo.model.json

Questions to Answer

  • How will we merge programmatic models (TypeScript classes) with declarative forms?
  • What artifacts will we support for declarative use?
    • models
    • datasources
    • ACL
    • middleware
    • remoting
    • boot scripts
    • repositories
      • If we're supporting them in declarative form, what is the schema for that?
  • What will the schema be for each of these artifacts?
@bajtos
Copy link
Member

bajtos commented Aug 23, 2017

Cross-posting from #533:

We should prescribe a convention for organizing different kinds of files in LoopBack-Next projects:

  • controllers
  • sequence actions
  • models and repositories
  • value providers
  • etc.

@jannyHou
Copy link
Contributor

Some suggestion from LoopBack3 user: the swagger module can also load dynamically created model, e.g. created in boot scripts.
See issue strongloop/loopback#3726

@twboc
Copy link

twboc commented Dec 16, 2017

Yes, I think that dynamic discovery of the models on boot time is quite important. We also had an internal discussion at the company that I work and dynamic updates on database model would be a plus. How we can get involved in development. Is there any documentation that specifies the formal process of creating a pull request to the next version (v4)

@dhmlau
Copy link
Member

dhmlau commented Dec 19, 2017

@twboc , thanks for your interest to create pull request to LB4! Here is our wiki page: https://github.com/strongloop/loopback-next/wiki/Contributing.

@kjdelisle
Copy link
Contributor

Rejecting due to incomplete criteria!

Questions:

  • What sort of naming convention is required for these declarative model formats (if any)?
  • If both JSON and YAML are present, which one takes precedent? (Or should it throw? Maybe make it configurable?)
  • What behaviour is expected for models that fail to correctly parse (invalid JSON/YAML)?
    • Silently ignore failure
    • Log failures (warnings)
    • Throw on failure (like LB3 currently does)
  • Is the declarative support intended to cover semantic validation of model definitions as well as syntactic validation? (Currently, there is no code in ModelDefinition that checks to see if parameters given are valid in a logical sense)
  • Is this a booter that follows typical Boot conventions, or a special case?

@dhmlau
Copy link
Member

dhmlau commented Feb 21, 2018

@raymondfeng ^^

@raymondfeng
Copy link
Contributor Author

We can use LoopBack v3 as the baseline for declarative support. Artifacts include but not limit to:

  • Configuration for application/component/server/controller/repository/model/connector/...
  • Model definitions
  • Datasources (config for connectors)
  • Actions
  • Swagger/OpenAPI specs
  • Other specs for extensions such as gRPC and GraphQL

What sort of naming convention is required for these declarative model formats (if any)?

The json/yaml files will be placed under corresponding folders corresponding to the artifact type. For example:

  • models/my-model.json: model definition in JSON
  • models/my-model.ts: model TS file

We should find out one file per artifact instance vs one file per artifact type (with keys for each artifact).

If both JSON and YAML are present, which one takes precedent? (Or should it throw? Maybe make it configurable?)

We should start with JSON only and expand to YAML or json5. IMO, JSON will take precedence similar as how Node.js require works for various extensions.

What behaviour is expected for models that fail to correctly parse (invalid JSON/YAML)?
Silently ignore failure
Log failures (warnings)
Throw on failure (like LB3 currently does)

Throw errors and abort.

Is the declarative support intended to cover semantic validation of model definitions as well as syntactic validation? (Currently, there is no code in ModelDefinition that checks to see if parameters given are valid in a logical sense)

If the validation can be expressed in JSON, such as a LB where condition, we can make it possible to be declarative.

Is this a booter that follows typical Boot conventions, or a special case?

Booters for different artifact types should follow similar conventions.

@kjdelisle
Copy link
Contributor

This is a spike that we should aim for ASAP; let's figure out what we want this to look like and get some code out for review and compare it to our expected UX and architecture.

@kjdelisle kjdelisle removed the non-MVP label Feb 22, 2018
@shimks
Copy link
Contributor

shimks commented Feb 22, 2018

Do we have a timebox for this task as well? Or its this issue falsely under 'Planning' at the moment?

@dhmlau dhmlau changed the title Create @loopback/boot to support declarative JSON/YAML files [SPIKE] Create @loopback/boot to support declarative JSON/YAML files Feb 23, 2018
@dhmlau
Copy link
Member

dhmlau commented Mar 27, 2018

Closing as resolved.

@dhmlau dhmlau closed this as completed Mar 27, 2018
@dhmlau
Copy link
Member

dhmlau commented Apr 5, 2018

During the meeting with @raymondfeng and @bajtos , we decided to close this ticket as it's been done. We have @loopback/boot but that's only to boot the typescript classes not the declarative files. I thought we don't have the spec / schema for what the declarative files should look like.

Reopening it to continue the discussion if it hasn't been done.

@dhmlau dhmlau reopened this Apr 5, 2018
@virkt25 virkt25 self-assigned this Apr 10, 2018
@bajtos
Copy link
Member

bajtos commented May 28, 2018

I watched the recording of your presentation, @virkt25. Great work! I like how well-thought was your proposal, I agree with most of what you proposed at the high-level.

I have a dozen of comments and discussion points, see below.

1

+1 for preserving TypeScript classes for Models and DataSources, they are an integral part of test-first development process described in http://loopback.io/doc/en/lb4/Implementing-features.html.

2

I agree with @raymondfeng that having a DataSource base class is too much and that we should scaffold a single user-editable DataSource class the first time a data source is created (edited).

3

In your proposal, the configuration (loaded from JSON) is hard-coded in the generated (base) class. At the same time, we need to support different configurations for different environments (dev/test/production) and load configuration from environment variables (https://12factor.net).

It was proposed to parse process.env variables inside datasource files. I consider that as an anti-pattern, because it couples high-level datasources with low/system-level configuration code. It makes the code difficult to test, because tests would have to manipulate global process.env. Instead, we should find a way how to inject ENV variables in such way that the actual usage of process.env stays in the top-level index.ts or application.ts file only.

On a similar topic, @raymondfeng proposed a sort of a registry holding datasource configuration for all environments in the app-level Context. I don't think this is a good solution - consider the case where an application is deployed to different geo location and where each geo location requires different connection strings (to use a colocated database instance). In other words, a production config is not a single object, but a set of location-specific objects.

My conclusion is that we should decouple datasource configuration from datasource implementation class and leverage dependency injection to provide arbitrary runtime-specific configuration from the application level. IMO, this addressed both needs 1) have different datasource config depending on dev/test/prod environment 2) build datasource configuration dynamically, either from process.env or perhaps external config providers like https://www.consul.io.

A mock-up app:

class MyApp extends RestApplication {
  async boot() {
    const dbConfig = 
      // Figure out what config to use.  @loopback/boot can provide
      // helpers to load config from ENV variables
    this.bind('datasources.db$config').to(dbConfig);
    // etc.
  }
}

A mock-up datasource:

// default config is used in tests
const defaultConfig = require('./db.datasource.json');

class DbDataSource extends juggler.DataSource {
  constructor(
    @inject('datasources.db$config')
    config: DataSourceOptions = defaultConfig
  ) {
    super(config);
  }
}

4

-1 for using in-memory datasources in acceptance tests. In my experience, acceptance tests need to run against a real database. Let me quote from our Best Practices, adding emphasis on my own:

To create a great test suite, think smaller and favor fast, focused unit-tests over slow application-wide end-to-end tests.

Say you are implementing the “search” endpoint of the Product resource described earlier. You might write the following tests:

  • One “acceptance test”, where you start the application, make an HTTP request to search for a given product name, and verify that expected products were returned. This verifies that all parts of the application are correctly wired together.
  • Few “integration tests” where you invoke ProductController API from JavaScript/TypeScript, talk to a real database, and verify that the queries built by the controller work as expected when executed by the database server.
  • Many “unit tests” where you test ProductController in isolation and verify that the controller handles all different situations, including error paths and edge cases.

Why is it important to run both acceptance and integration tests against a real database:

  1. Verify that all parts of the application work together correctly. If we re-configure datasources to in-memory, we may miss a bug in the way how datasources are configured outside of tests.
  2. Verify that the queries make by Controllers via Repositories work as expected in the database. For example, certain database require indexes to be set up before a collection can be queried. Other databases may implement features like geo/nearby search differently from our in-memory connector.

Having said that, I acknowledge the need to configure datasources differently in dev and production. I just don't think we should leverage this feature for acceptance tests.

5

The generated model base class uses TypeScript decorators to describe model properties. I find this solution suboptimal, because it creates two places where model properties are defined. I am concerned that e.g. developers may forget to run regenerate command after editing model JSON and then be surprised why their changes were not picked up by our runtime.

Instead, I am proposing to leverage the existing static property Entity.definition. See https://github.com/strongloop/loopback-next/blob/ba138cc5b93bdd353d583006ca6cf9722f9d6e25/packages/repository/src/model.ts#L138-L140

My idea is to load model configuration from the JSON file and store it in this definition property.

const definition: ModelDefinitionSyntax = require('./todo-definition.json');

export class TodoBase extends Entity {
  static definition: ModelDefinition = new ModelDefinition(definition);
  id?: number;
  title: string;
  desc?: string;
  isComplete?: boolean;
}

Please note that under the hood of the decorator-based approach, the @model decorator builds the definition property too: https://github.com/strongloop/loopback-next/blob/ba138cc5b93bdd353d583006ca6cf9722f9d6e25/packages/repository/src/decorators/model.decorator.ts#L54-L75

Later on, DefaultCrudRepository is picking up model definition from the definition property:

https://github.com/strongloop/loopback-next/blob/ba138cc5b93bdd353d583006ca6cf9722f9d6e25/packages/repository/src/repositories/legacy-juggler-bridge.ts#L80-L84

6

Can we provide getId() implementation in the auto-generated model base class? Users can always override it in their model class if needed. I expect very few users will need to do that, in which case I find it better to hide the getId() complexity from them.

7

Regarding EJS usage for code generation: isn't this the situation well-known from server-rendered HTML pages? When there is a single piece of code responsible for processing data and building the string output (HTML page or TypeScript source), this code becomes difficult to understand, test and modify. One of the recommended solutions is Model-View-Controller design, where the code responsible for data manipulation (Model/Controller) is decoupled from the code responsible for rendering (View). The rendering code should have as little logic as possible.

IMO, we should leverage the same pattern in CLI too:

  • Have a piece of code that builds any data needed to render the target TypeScript file, including transformations like changing case of identifiers (productReview vs. ProductReview), etc.
  • Use an EJS template to render the TypeScript output.

In my experience from loopback-sdk-angular, EJS is well suited for generating JavaScript/TypeScript code, because the templating syntax does not interfere/conflict with JS/TS syntax.

8

Formatting of generated code: it would be great if our tooling was able to pick up prettier version and configuration from the project we are running in and format the generated code accordingly.

Imagine a user is using prettier in their project (as scaffolded by lb4 app), but changed prettier config to use double-quotes and tabs. If we don't honor their style, they will have to run npm run lint:fix every time they add a new model or update auto-generated TypeScript definitions.

It is also important to run prettier from node_modules of the target project, because different prettier version have variations in how they format the same piece of code and we cannot users to have exactly the same version of prettier as we use in LB4 tooling.

9

Regarding missing model base classes: I would personally keep these generated TypeScript files versioned in git. It will make it easier to read source code on GitHub, where auto-generated source files are not available, and also to review code changes made by our codegen tooling.

I also like the idea of moving generated files out of sight into a their own directory (models/generated).

10

What base model/entity classes to offer in lb4 model prompt? Beside the built-in class Entity, we should offer all existing app-defined model classes too. Consider the following model hierarchy:

  • BaseUser contains properties and methods common to all persons registered in the app, e.g. email and password.
  • Customer extends BaseUser and adds data and behavior specific to people making purchases, e.g. shipping address and "hasMany orders" relation.
  • Admin extends BaseUser and adds data and behavior specific to people administering our system, e.g. department code and "hasAndBelongsToMany products" relation.

When creating Customer model, I'd like to pick BaseUser in the CLI prompt.

IIRC, LB 3.x CLI offers app-defined models too.

11

Where does regenerate belong to? IMO, neither lb-tsc nor lb4 CLI are good places.

  • The primary goal of lb-tsc is to compile TypeScript files. By adding regenerate option, we will make the tool unnecessarily complex. In my experience, it's better to have many small tools (scripts), each doing one thing well (the Unix philosophy).
  • lb4 CLI is typically installed globally. Invoking globally-installed modules from project build has been considered as anti-pattern for quite some time now. If we add lb4 CLI into dev-dependencies of every project, then we will unnecessarily increase the size of project's dev-dependencies.

Ideally, I'd like to see

  • A new script that's either bundled in @loopback/build or in a new standalone module. A standalone module offers the advantage of independent versioning - a breaking change in regenerate output does not have to bump up semver-major of all build tooling.
  • lb4 CLI should try to load regenerate version installed in the target project, to ensure we are generating the same code as when regenerate is invoked from project's build step. As a fallback, we can bundle our own regenerate version inside lb4 CLI. In which case we most likely don't want to bundle entire @loopback/build (which includes prettier, typescript, etc.) as a dependency of @loopback/cli, and the script needs to live in a different package.

12

The presentation mentioned few helper commands:

  • lb4 test — builds (includes regenerating) and tests the current LB4 application
  • lb4 start — builds (includes regenerating) and starts the current LB4 application
  • lb4 regenerate — just regenerate artifacts and index.ts files but no build
  • lb4 build — builds (includes regenerating) of current LB4 application

I consider this as a bad idea:

  • The regenerate step should be configured as part of project build (pretest invokes build, prebuild invokes regenerate). We need this to allowgit clone && npm install && npm test workflow, I would not want to install lb4 CLI on our build machines.
  • These helpers don't save that much typing (npm test vs. lb4 test, npm start vs lb4 start).
  • To me, the added value is very small compared to the extra complexity of implementing and maintaining these helpers. (Consider e.g. tsc --watch and mocha --watch that we would have to somehow support too.)

@shimks
Copy link
Contributor

shimks commented May 29, 2018

@virkt25 Would you consider this spike done? I remember there being follow up issues that can be estimated.

@virkt25
Copy link
Contributor

virkt25 commented Jun 1, 2018

Yea I would consider this spike done. Leaving this issue open till I have time to transfer @bajtos 's comments to issues tracking the follow-up work.


The outcomes is to use CLI to generate a JSON file which can be edited by users / CLI tooling. The CLI will also generate a .ts file with the class for typing purposes. Also gives a programatic way to modify the class. The .ts file is generated using an ejs template. Some complex artifacts may need a "base" class and an "actual" class to be generated. Base classes will live in a base / generated folder and should not be modified by a user. The "actual" class extending the base is where the user can make modifications to their code for programatic overrides. These complex artifacts will need to have their base classes re-generated when the underlying JSON is modified -- we will provide a prebuild script to do this for the user each time the application is built so the generated classes are always based on the latest JSON. generated folder should be cleaned when a user cleans the project.

@bajtos
Copy link
Member

bajtos commented Jul 23, 2018

Yea I would consider this spike done. Leaving this issue open till I have time to transfer @bajtos 's comments to issues tracking the follow-up work.

@virkt25 have you managed to transfer the comments to the appropriate issues? Can we close this spike story as done?

@bajtos
Copy link
Member

bajtos commented Jul 30, 2018

Yea I would consider this spike done. Leaving this issue open till I have time to transfer @bajtos 's comments to issues tracking the follow-up work.

Have you managed to transfer the comments to the appropriate issues? Can we close this spike story as done?

@virkt25 ping ☝️

@virkt25
Copy link
Contributor

virkt25 commented Aug 15, 2018

Created #1609 and #1610 to track items from @bajtos 's comment that need to be done still / explored. All others have either been addressed / won't be pursued at this time (ex: helper commands).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants