Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backend): enhance config loader with embedding support and impro… #74

Merged
merged 20 commits into from
Jan 6, 2025
Merged
Show file tree
Hide file tree
Changes from 17 commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
78a3545
feat(backend): enhance config loader with embedding support and impro…
Sma1lboy Dec 16, 2024
72c6dbc
feat(backend): implement ModelStatusManager for tracking model downlo…
Sma1lboy Dec 16, 2024
1340532
chore: adding uitls ufc
Sma1lboy Dec 16, 2024
0ee6b84
Merge branch 'main' of https://github.com/Sma1lboy/codefox into feat-…
NarwhalChen Dec 25, 2024
0b412b0
feat: adding embedding provider and openai-embedding provider
NarwhalChen Dec 25, 2024
b378185
[autofix.ci] apply automated fixes
autofix-ci[bot] Dec 25, 2024
4c66d05
feat: implmenting dynamic loader for embedding and models in hugging …
NarwhalChen Dec 28, 2024
ad74c49
Merge branch 'feat-adding-embedding-config' of https://github.com/Sma…
NarwhalChen Dec 28, 2024
97cae53
Merge branch 'main' into feat-adding-embedding-config
Sma1lboy Dec 28, 2024
e579146
feat(backend): refactor model downloading and configuration handling
Sma1lboy Dec 28, 2024
8b25760
Merge branch 'feat-adding-embedding-config' of https://github.com/Sma…
NarwhalChen Dec 30, 2024
b7a8b30
to: using fastemb to implement embedding downloader
NarwhalChen Dec 30, 2024
7e68b3a
to: moving embedding provider to backend
NarwhalChen Dec 30, 2024
2c2027d
[autofix.ci] apply automated fixes
autofix-ci[bot] Dec 30, 2024
fb9be7d
feat: request llm-server with openai/some embedding provider
NarwhalChen Dec 30, 2024
0b85081
Merge branch 'feat-adding-embedding-config' of https://github.com/Sma…
NarwhalChen Dec 30, 2024
6ad8b9b
to: adding return type and comments for generateEmbResponse
NarwhalChen Dec 31, 2024
56c6ffc
fix: fixing the change of env variable
NarwhalChen Jan 6, 2025
b9c1909
fix: deleting useless embedding index.ts
NarwhalChen Jan 6, 2025
50d3bd1
Merge branch 'main' into feat-adding-embedding-config
ZHallen122 Jan 6, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
import { EmbeddingProvider } from "src/common/embedding-provider";

describe('Model Provider Test', () => {
let embProvider = EmbeddingProvider.getInstance();
it('should generate a response from the model provider', async () => {
let res = await embProvider.generateEmbResponse("Your text string goes here", "text-embedding-3-small");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Mismatch between test usage and interface definition.

The generateEmbResponse method is invoked with two string arguments here, but according to EmbeddingProvider, it requires (GenerateMessageParams, Response). This discrepancy can cause compile-time or runtime errors and indicates that the test may not actually be testing the real interface.

- let res = await embProvider.generateEmbResponse("Your text string goes here", "text-embedding-3-small");
+ // Example fix: if the provider truly expects (params, res)
+ // either mock 'res' or adapt the test to match your real usage:
+ const params = { model: "text-embedding-3-small", message: "Your text string goes here" };
+ let mockRes = makeMockResponse(); // pseudo-code
+ let result = await embProvider.generateEmbResponse(params, mockRes);

Committable suggestion skipped: line range outside the PR's diff.

console.log(res);
});
});
49 changes: 49 additions & 0 deletions backend/src/common/embedding-provider/index.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
import { HttpService } from "@nestjs/axios";
import { Logger } from "@nestjs/common";
import { rejects } from "assert";
import { resolve } from "path";

export interface ModelProviderConfig {
endpoint: string;
defaultModel?: string;
}

export class EmbeddingProvider {
private static instance: EmbeddingProvider | undefined = undefined;
private logger = new Logger(EmbeddingProvider.name);
constructor(private readonly httpService: HttpService, private readonly config: ModelProviderConfig) {}

public static getInstance() {
if(!this.instance){
this.instance = new EmbeddingProvider(new HttpService(), {
endpoint: 'http://localhost:3001',
});
}
return this.instance;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Replace 'this' references in static context with the class name.

Accessing this.instance inside a static method can be unclear. Per lint/complexity rules, prefer using the class name explicitly for improved readability and maintainability.

 public static getInstance() {
-    if (!this.instance) {
-        this.instance = new EmbeddingProvider(new HttpService(), {
+    if (!EmbeddingProvider.instance) {
+        EmbeddingProvider.instance = new EmbeddingProvider(new HttpService(), {
             endpoint: 'http://localhost:3001',
         });
-    }
-    return this.instance;
+    }
+    return EmbeddingProvider.instance;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public static getInstance() {
if(!this.instance){
this.instance = new EmbeddingProvider(new HttpService(), {
endpoint: 'http://localhost:3001',
});
}
return this.instance;
}
public static getInstance() {
if (!EmbeddingProvider.instance) {
EmbeddingProvider.instance = new EmbeddingProvider(new HttpService(), {
endpoint: 'http://localhost:3001',
});
}
return EmbeddingProvider.instance;
}
🧰 Tools
🪛 Biome (1.9.4)

[error] 17-17: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 18-18: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 22-22: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


// return embedding array
public async generateEmbResponse(input: string, model: string): Promise<any[] | null>{
try{
const res = this.httpService.post(`${this.config.endpoint}/embedding`, {
content: input,
model: model,
});
return await new Promise((resolve, rejects) => {
res.subscribe({
next: (value) => {
console.log(value.data.embedding);
resolve(value.data.embedding);
},
error: (error) => {
this.logger.error(error);
rejects(null);
}
})
})
} catch (error) {
this.logger.error(error);
return null;
}
}
}
24 changes: 19 additions & 5 deletions backend/src/config/common-path.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import * as path from 'path';
import { existsSync, mkdirSync, promises } from 'fs-extra';
import { existsSync, mkdirSync, promises, writeFileSync } from 'fs-extra';
import { ConfigType } from '@nestjs/config';

// Constants for base directories
const APP_NAME = 'codefox';
Expand Down Expand Up @@ -27,17 +28,30 @@ const ensureDir = (dirPath: string): string => {
export const getRootDir = (): string => ensureDir(ROOT_DIR);

// Configuration Paths
export const getConfigDir = (): string =>
ensureDir(path.join(getRootDir(), 'config'));
export const getConfigPath = (configName: string): string =>
path.join(getConfigDir(), `${configName}.json`);
export const getConfigPath = (): string => {
const rootPath = ensureDir(path.join(getRootDir()));
return path.join(rootPath, 'config.json');
};

export const getModelStatusPath = (): string => {
const rootPath = ensureDir(getRootDir());
const modelStatusPath = path.join(rootPath, `model-status.json`);
writeFileSync(modelStatusPath, '{}');
return modelStatusPath;
};

// Models Directory
export const getModelsDir = (): string =>
ensureDir(path.join(getRootDir(), 'models'));
export const getModelPath = (modelName: string): string =>
path.join(getModelsDir(), modelName);

// Embs Directory
export const getEmbDir = (): string =>
ensureDir(path.join(getRootDir(), 'embeddings'));
export const getEmbPath = (modelName: string): string =>
path.join(getModelsDir(), modelName);

Comment on lines +51 to +56
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use getEmbDir() instead of getModelsDir() in getEmbPath to avoid confusion.
Currently, getEmbPath points to the main models directory rather than the dedicated embeddings directory, which might lead to unintended overwriting or mixing of files.

 export const getEmbPath = (modelName: string): string =>
-  path.join(getModelsDir(), modelName);
+  path.join(getEmbDir(), modelName);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Embs Directory
export const getEmbDir = (): string =>
ensureDir(path.join(getRootDir(), 'embeddings'));
export const getEmbPath = (modelName: string): string =>
path.join(getModelsDir(), modelName);
// Embs Directory
export const getEmbDir = (): string =>
ensureDir(path.join(getRootDir(), 'embeddings'));
export const getEmbPath = (modelName: string): string =>
path.join(getEmbDir(), modelName);

// Project-Specific Paths
export const getProjectsDir = (): string =>
ensureDir(path.join(getRootDir(), 'projects'));
Expand Down
228 changes: 213 additions & 15 deletions backend/src/config/config-loader.ts
Original file line number Diff line number Diff line change
Expand Up @@ -2,54 +2,252 @@ import * as fs from 'fs';
import * as path from 'path';
import * as _ from 'lodash';
import { getConfigPath } from './common-path';
export interface ChatConfig {
import { ConfigType } from 'src/downloader/universal-utils';
import { Logger } from '@nestjs/common';

export interface ModelConfig {
model: string;
endpoint?: string;
token?: string;
default?: boolean;
task?: string;
}

export class ConfigLoader {
private chatsConfig: ChatConfig[];
export interface EmbeddingConfig {
model: string;
endpoint?: string;
default?: boolean;
token?: string;
}

export interface AppConfig {
models?: ModelConfig[];
embeddings?: EmbeddingConfig[];
}

export const exampleConfigContent = `{
// Chat models configuration
// You can configure multiple chat models
"models": [
// Example of OpenAI GPT configuration
{
"model": "gpt-3.5-turbo",
"endpoint": "https://api.openai.com/v1",
"token": "your-openai-token", // Replace with your OpenAI token
"default": true // Set as default chat model
},

// Example of local model configuration
{
"model": "llama2",
"endpoint": "http://localhost:11434/v1"
}
],

// Embedding model configuration (optional)
"embeddings": [{
"model": "text-embedding-ada-002",
"endpoint": "https://api.openai.com/v1",
"token": "your-openai-token", // Replace with your OpenAI token
"default": true // Set as default embedding
}]
}`;

export class ConfigLoader {
readonly logger = new Logger(ConfigLoader.name);
private type: string;
private static instances: Map<ConfigType, ConfigLoader> = new Map();
private static config: AppConfig;
private readonly configPath: string;

constructor() {
this.configPath = getConfigPath('config');
private constructor(type: ConfigType) {
this.type = type;
this.configPath = getConfigPath();
this.initConfigFile();
this.loadConfig();
}

public static getInstance(type: ConfigType): ConfigLoader {
if (!ConfigLoader.instances.has(type)) {
ConfigLoader.instances.set(type, new ConfigLoader(type));
}
return ConfigLoader.instances.get(type)!;
}

public initConfigFile(): void {
Logger.log('Creating example config file', 'ConfigLoader');

const config = getConfigPath();
if (fs.existsSync(config)) {
return;
}

if (!fs.existsSync(config)) {
//make file
fs.writeFileSync(config, exampleConfigContent, 'utf-8');
}
Logger.log('Creating example config file', 'ConfigLoader');
}

public reload(): void {
this.loadConfig();
}

private loadConfig() {
const file = fs.readFileSync(this.configPath, 'utf-8');
try {
Logger.log(
`Loading configuration from ${this.configPath}`,
'ConfigLoader',
);
const file = fs.readFileSync(this.configPath, 'utf-8');
const jsonContent = file.replace(
/\\"|"(?:\\"|[^"])*"|(\/\/.*|\/\*[\s\S]*?\*\/)/g,
(m, g) => (g ? '' : m),
);
ConfigLoader.config = JSON.parse(jsonContent);
this.validateConfig();
} catch (error) {
if (
error.code === 'ENOENT' ||
error.message.includes('Unexpected end of JSON input')
) {
ConfigLoader.config = {};
this.saveConfig();
} else {
throw error;
}
}

this.chatsConfig = JSON.parse(file);
console.log('Raw file content:', this.chatsConfig);
this.logger.log(ConfigLoader.config);
}

get<T>(path: string) {
get<T>(path?: string): T {
if (!path) {
return this.chatsConfig as unknown as T;
return ConfigLoader.config as unknown as T;
}
return _.get(this.chatsConfig, path) as T;
return _.get(ConfigLoader.config, path) as T;
}

set(path: string, value: any) {
_.set(this.chatsConfig, path, value);
_.set(ConfigLoader.config, path, value);
this.saveConfig();
}

private saveConfig() {
const configDir = path.dirname(this.configPath);
if (!fs.existsSync(configDir)) {
fs.mkdirSync(configDir, { recursive: true });
}
fs.writeFileSync(
this.configPath,
JSON.stringify(this.chatsConfig, null, 4),
JSON.stringify(ConfigLoader.config, null, 2),
'utf-8',
);
}

addConfig(config: ModelConfig | EmbeddingConfig) {
if (!ConfigLoader.config[this.type]) {
ConfigLoader.config[this.type] = [];
}
this.logger.log(ConfigLoader.config);
const index = ConfigLoader.config[this.type].findIndex(
(chat) => chat.model === config.model,
);
if (index !== -1) {
ConfigLoader.config[this.type].splice(index, 1);
}

if (config.default) {
ConfigLoader.config.models.forEach((chat) => {
chat.default = false;
});
}

ConfigLoader.config[this.type].push(config);
this.saveConfig();
}
Comment on lines +148 to +168
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Potential mix-up with this.type and models.
When adding a default embedding, the loop explicitly sets ConfigLoader.config.models.forEach(...) to disable existing defaults. This might unintentionally alter models when this.type is ConfigType.EMBEDDINGS.

+ // If adding an embedding config, ensure we only reset defaults among embeddings
 if (config.default) {
-  ConfigLoader.config.models.forEach((chat) => {
+  (ConfigLoader.config[this.type] as (ModelConfig | EmbeddingConfig)[]).forEach((c) => {
     c.default = false;
   });
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
addConfig(config: ModelConfig | EmbeddingConfig) {
if (!ConfigLoader.config[this.type]) {
ConfigLoader.config[this.type] = [];
}
this.logger.log(ConfigLoader.config);
const index = ConfigLoader.config[this.type].findIndex(
(chat) => chat.model === config.model,
);
if (index !== -1) {
ConfigLoader.config[this.type].splice(index, 1);
}
if (config.default) {
ConfigLoader.config.models.forEach((chat) => {
chat.default = false;
});
}
ConfigLoader.config[this.type].push(config);
this.saveConfig();
}
addConfig(config: ModelConfig | EmbeddingConfig) {
if (!ConfigLoader.config[this.type]) {
ConfigLoader.config[this.type] = [];
}
this.logger.log(ConfigLoader.config);
const index = ConfigLoader.config[this.type].findIndex(
(chat) => chat.model === config.model,
);
if (index !== -1) {
ConfigLoader.config[this.type].splice(index, 1);
}
if (config.default) {
(ConfigLoader.config[this.type] as (ModelConfig | EmbeddingConfig)[]).forEach((c) => {
c.default = false;
});
}
ConfigLoader.config[this.type].push(config);
this.saveConfig();
}


removeConfig(modelName: string): boolean {
if (!ConfigLoader.config[this.type]) {
return false;
}

const initialLength = ConfigLoader.config[this.type].length;
ConfigLoader.config.models = ConfigLoader.config[this.type].filter(
(chat) => chat.model !== modelName,
);

if (ConfigLoader.config[this.type].length !== initialLength) {
this.saveConfig();
return true;
}

return false;
}
Comment on lines +170 to +186
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Inconsistent usage of this.type in removeConfig.
Updating ConfigLoader.config.models even if this.type is 'embeddings' poses a risk of accidentally removing clone items from models.

- ConfigLoader.config.models = ConfigLoader.config[this.type].filter(
-   (chat) => chat.model !== modelName,
- );
+ ConfigLoader.config[this.type] = ConfigLoader.config[this.type].filter(
+   (chat: ModelConfig | EmbeddingConfig) => chat.model !== modelName,
+ );
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
removeConfig(modelName: string): boolean {
if (!ConfigLoader.config[this.type]) {
return false;
}
const initialLength = ConfigLoader.config[this.type].length;
ConfigLoader.config.models = ConfigLoader.config[this.type].filter(
(chat) => chat.model !== modelName,
);
if (ConfigLoader.config[this.type].length !== initialLength) {
this.saveConfig();
return true;
}
return false;
}
removeConfig(modelName: string): boolean {
if (!ConfigLoader.config[this.type]) {
return false;
}
const initialLength = ConfigLoader.config[this.type].length;
ConfigLoader.config[this.type] = ConfigLoader.config[this.type].filter(
(chat: ModelConfig | EmbeddingConfig) => chat.model !== modelName,
);
if (ConfigLoader.config[this.type].length !== initialLength) {
this.saveConfig();
return true;
}
return false;
}


getAllConfigs(): EmbeddingConfig[] | ModelConfig[] | null {
const res = ConfigLoader.config[this.type];
return Array.isArray(res) ? res : null;
}

validateConfig() {
if (!this.chatsConfig) {
throw new Error("Invalid configuration: 'chats' section is missing.");
if (!ConfigLoader.config) {
ConfigLoader.config = {};
}

if (typeof ConfigLoader.config !== 'object') {
throw new Error('Invalid configuration: Must be an object');
}

if (ConfigLoader.config.models) {
if (!Array.isArray(ConfigLoader.config.models)) {
throw new Error("Invalid configuration: 'chats' must be an array");
}

ConfigLoader.config.models.forEach((chat, index) => {
if (!chat.model) {
throw new Error(
`Invalid chat configuration at index ${index}: 'model' is required`,
);
}
});

const defaultChats = ConfigLoader.config.models.filter(
(chat) => chat.default,
);
if (defaultChats.length > 1) {
throw new Error(
'Invalid configuration: Multiple default chat configurations found',
);
}
}

if (ConfigLoader.config[ConfigType.EMBEDDINGS]) {
this.logger.log(ConfigLoader.config[ConfigType.EMBEDDINGS]);
if (!Array.isArray(ConfigLoader.config[ConfigType.EMBEDDINGS])) {
throw new Error("Invalid configuration: 'embeddings' must be an array");
}

ConfigLoader.config.models.forEach((emb, index) => {
if (!emb.model) {
throw new Error(
`Invalid chat configuration at index ${index}: 'model' is required`,
);
}
});

const defaultChats = ConfigLoader.config[ConfigType.EMBEDDINGS].filter(
(chat) => chat.default,
);
if (defaultChats.length > 1) {
throw new Error(
'Invalid configuration: Multiple default emb configurations found',
);
}
}
}

getConfig(): AppConfig {
return ConfigLoader.config;
}
}
Loading
Loading