-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: generativeaionvertexai_embedding_batch #3913
base: main
Are you sure you want to change the base?
Conversation
b5e809a
to
fc64882
Compare
Here is the summary of changes. You are about to add 1 region tag.
This comment is generated by snippet-bot.
|
fc64882
to
35afe3e
Compare
35afe3e
to
32eb6c9
Compare
@gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request introduces a new snippet and its corresponding test for creating batch embeddings using the Vertex AI API. The code is well-structured and follows the project's contribution guidelines. However, there are a few areas that could be improved for clarity, maintainability, and best practices.
/** | ||
* TODO(developer): Uncomment/update these variables before running the sample. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TODO comment asks the developer to uncomment/update these variables. It would be helpful to provide more specific instructions or examples of how these variables should be updated, including valid formats and constraints. For instance, specify the expected format for inputUri
and outputUri
, and whether jobName
can contain special characters.
/** | |
* TODO(developer): Uncomment/update these variables before running the sample. | |
*/ | |
/** | |
* TODO(developer): Uncomment and replace the following placeholders before running the sample. | |
* - projectId: Replace with your Google Cloud project ID. | |
* - inputUri: Replace with the URI of the input dataset. This can be a BigQuery table or a Google Cloud Storage file. | |
* Examples: "gs://[BUCKET]/[DATASET].jsonl" or "bq://[PROJECT].[DATASET].[TABLE]". | |
* - outputUri: Replace with the URI where the output will be stored. This can be a BigQuery table or a Google Cloud Storage file. | |
* Examples: "gs://[BUCKET]/[OUTPUT].jsonl" or "bq://[PROJECT].[DATASET].[TABLE]". | |
* - jobName: Replace with a descriptive name for the batch embedding job. Avoid special characters. | |
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't follow our style guidelines and would make sample management unwieldy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your concern about style guidelines. Could you please clarify which specific aspects of the suggested changes violate your style guide? Knowing the specific rules will help me refine the suggestion to better align with your team's preferences. For example, are there length restrictions on comments, preferred ways to format URIs, or specific naming conventions for variables that are being violated?
const textEmbeddingModel = 'text-embedding-005'; | ||
const location = 'us-central1'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Create batch prediction job request | ||
const [response] = await jobServiceClient.createBatchPredictionJob(request); | ||
|
||
console.log('Raw response: ', JSON.stringify(response, null, 2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of just logging the raw response, consider logging specific fields that are relevant to the user, such as the job ID, status, and output URI. This would make it easier for the user to track the job's progress and access the results.
console.log('Raw response: ', JSON.stringify(response, null, 2)); | |
console.log('Batch prediction job created:', response.name); | |
console.log('Job ID:', response.name.split('/').pop()); | |
console.log('Output URI:', response.outputConfig.gcsDestination.outputUriPrefix); |
const location = 'us-central1'; | ||
const inputUri = | ||
'gs://cloud-samples-data/generative-ai/embeddings/embeddings_input.jsonl'; | ||
let outputUri = 'gs://ucaip-samples-test-output/'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
before(async () => { | ||
const bucketName = `test-bucket-${uuid()}`; | ||
// Create a Google Cloud Storage bucket for UsageReports |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
return jobServiceClient.deleteBatchPredictionJob(deleteRequest); | ||
}); | ||
// Delete the Google Cloud Storage bucket created for usage reports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
111148a
to
d66ec9d
Compare
d66ec9d
to
b2042bf
Compare
b2042bf
to
1c00638
Compare
Description
Fixes #
Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.
Checklist
npm test
(see Testing)npm run lint
(see Style)GoogleCloudPlatform/nodejs-docs-samples
. Not a fork.