Commit

Parse input with parse5
kevin-on committed Oct 12, 2024
1 parent d1cfbcb commit eb92ea8
Showing 5 changed files with 95 additions and 41 deletions.
23 changes: 23 additions & 0 deletions package-lock.json

1 change: 1 addition & 0 deletions package.json
@@ -50,6 +50,7 @@
     "lodash.isequal": "^4.5.0",
     "lucide-react": "^0.447.0",
     "openai": "^4.65.0",
+    "parse5": "^7.1.2",
     "react": "^18.3.1",
     "react-dom": "^18.3.1",
     "react-markdown": "^9.0.1",
84 changes: 57 additions & 27 deletions src/components/ReactMarkdown.tsx
@@ -1,55 +1,85 @@
+import { parseFragment } from 'parse5'
 import Markdown from 'react-markdown'

 import MarkdownCodeComponent from './MarkdownCodeComponent'

 function parsesmtcmpBlocks(input: string): (
   | { type: 'string'; content: string }
   | {
-      type: 'smtcmpBlock'
+      type: 'smtcmp_block'
       content: string
       language?: string
       filename?: string
     }
 )[] {
-  const regex = /<smtcmpBlock([^>]*)>\s*([\s\S]*?)\s*(?:<\/smtcmpBlock>|$)/g
-  const matches = input.matchAll(regex)
-  const result: (
+  const parsedResult: (
     | { type: 'string'; content: string }
     | {
-        type: 'smtcmpBlock'
+        type: 'smtcmp_block'
         content: string
         language?: string
         filename?: string
       }
   )[] = []

-  let lastIndex = 0
-  for (const match of matches) {
-    if (match.index > lastIndex) {
-      result.push({
-        type: 'string',
-        content: input.slice(lastIndex, match.index),
-      })
+  const fragment = parseFragment(input, {
+    sourceCodeLocationInfo: true,
+  })
+  let lastEndOffset = 0
+  for (const node of fragment.childNodes) {
+    if (node.nodeName === 'smtcmp_block') {
+      if (!node.sourceCodeLocation) {
+        throw new Error('sourceCodeLocation is undefined')
+      }
+      const startOffset = node.sourceCodeLocation.startOffset
+      const endOffset = node.sourceCodeLocation.endOffset
+      if (startOffset > lastEndOffset) {
+        parsedResult.push({
+          type: 'string',
+          content: input.slice(lastEndOffset, startOffset),
+        })
+      }
+
+      const language = node.attrs.find(
+        (attr) => attr.name === 'language',
+      )?.value
+      const filename = node.attrs.find(
+        (attr) => attr.name === 'filename',
+      )?.value
+
+      const children = node.childNodes
+      if (children.length === 0) {
+        parsedResult.push({
+          type: 'smtcmp_block',
+          content: '',
+          language,
+          filename,
+        })
+      } else {
+        const innerContentStartOffset =
+          children[0].sourceCodeLocation?.startOffset
+        const innerContentEndOffset =
+          children[children.length - 1].sourceCodeLocation?.endOffset
+        if (!innerContentStartOffset || !innerContentEndOffset) {
+          throw new Error('sourceCodeLocation is undefined')
+        }
+        parsedResult.push({
+          type: 'smtcmp_block',
+          content: input.slice(innerContentStartOffset, innerContentEndOffset),
+          language,
+          filename,
+        })
+      }
+      lastEndOffset = endOffset
     }
-    const [, attributes, content] = match
-    const language = attributes.match(/language="([^"]+)"/)?.[1]
-    const filename = attributes.match(/filename="([^"]+)"/)?.[1]
-    result.push({
-      type: 'smtcmpBlock',
-      content,
-      language,
-      filename,
-    })
-    lastIndex = match.index + match[0].length
   }
-  if (lastIndex < input.length) {
-    result.push({
+  if (lastEndOffset < input.length) {
+    parsedResult.push({
       type: 'string',
-      content: input.slice(lastIndex),
+      content: input.slice(lastEndOffset),
     })
   }

-  return result
+  return parsedResult
 }

export default function ReactMarkdown({
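
The offset-based splitting that the new `parsesmtcmpBlocks` performs can be illustrated without parse5. The following is a minimal sketch, not the commit's code: `Segment`, `LocatedBlock`, and `splitByOffsets` are hypothetical names, and the offsets are computed by hand with `indexOf`, whereas the commit reads them from each node's `sourceCodeLocation`.

```typescript
// Hypothetical types for this sketch; they loosely mirror the result shape in the diff.
type Segment =
  | { type: 'string'; content: string }
  | { type: 'smtcmp_block'; content: string }

// Offsets of a whole element and of its inner content within the input string.
interface LocatedBlock {
  startOffset: number
  endOffset: number
  innerStart: number
  innerEnd: number
}

function splitByOffsets(input: string, blocks: LocatedBlock[]): Segment[] {
  const segments: Segment[] = []
  let lastEndOffset = 0
  for (const block of blocks) {
    // Plain text between the previous block (or the start) and this block.
    if (block.startOffset > lastEndOffset) {
      segments.push({
        type: 'string',
        content: input.slice(lastEndOffset, block.startOffset),
      })
    }
    // The block's inner content, sliced from the original input.
    segments.push({
      type: 'smtcmp_block',
      content: input.slice(block.innerStart, block.innerEnd),
    })
    lastEndOffset = block.endOffset
  }
  // Trailing text after the last block.
  if (lastEndOffset < input.length) {
    segments.push({ type: 'string', content: input.slice(lastEndOffset) })
  }
  return segments
}

// Usage: locate one block by hand (assumption: exactly one well-formed block).
const open = '<smtcmp_block>'
const close = '</smtcmp_block>'
const input = `intro ${open}body${close} outro`
const startOffset = input.indexOf(open)
const innerEnd = input.indexOf(close)
const segments = splitByOffsets(input, [
  {
    startOffset,
    endOffset: innerEnd + close.length,
    innerStart: startOffset + open.length,
    innerEnd,
  },
])
console.log(segments.map((segment) => `${segment.type}: "${segment.content}"`))
```

For this input the three segments carry the contents `intro `, `body`, and ` outro`. Slicing the original string by offsets, rather than re-serializing parsed nodes, keeps the text exactly as it was written.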
10 changes: 5 additions & 5 deletions src/utils/apply.ts
@@ -10,14 +10,14 @@ const systemPrompt = `You are an intelligent assistant helping a user apply chan
 You will receive:
 1. The content of the target markdown file.
-2. A conversation history between the user and the assistant. This conversation may contain multiple markdown blocks suggesting changes to the file. Markdown blocks are indicated by the <smtcmpBlock> tag. For example:
-<smtcmpBlock>
+2. A conversation history between the user and the assistant. This conversation may contain multiple markdown blocks suggesting changes to the file. Markdown blocks are indicated by the <smtcmp_block> tag. For example:
+<smtcmp_block>
 <!-- ... existing content ... -->
 {{ edit_1 }}
 <!-- ... existing content ... -->
 {{ edit_2 }}
 <!-- ... existing content ... -->
-</smtcmpBlock>
+</smtcmp_block>
 3. A single, specific markdown block extracted from the conversation history. This block contains the exact changes that should be applied to the target file.
 Please rewrite the entire markdown file with ONLY the changes from the specified markdown block applied. DO NOT apply changes suggested by other parts of the conversation. Preserve all parts of the original file that are not related to the changes. Output only the file content, without any additional words or explanations.`
@@ -72,9 +72,9 @@ ${chatMessages
 ## Changes to Apply
 Here is the markdown block that indicates where content changes should be applied.
-<smtcmpBlock>
+<smtcmp_block>
 ${blockToApply}
-</smtcmpBlock>
+</smtcmp_block>
 Now rewrite the entire file with the changes applied. Immediately start your response with \`\`\`${currentFile.path}`
}
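
The prompt in this file wraps `blockToApply` in `<smtcmp_block>` tags inline. The same wrapping could be factored into a small helper, sketched here under assumptions: `wrapInSmtcmpBlock` is a hypothetical function that is not part of this commit, and attribute values are not escaped, so it assumes they contain no double quotes.

```typescript
// Hypothetical helper: wraps content in <smtcmp_block> tags, optionally adding
// the filename and language attributes used in the prompts.
function wrapInSmtcmpBlock(
  content: string,
  attrs: { filename?: string; language?: string } = {},
): string {
  let attrText = ''
  // Assumption: values contain no double quotes, so no escaping is done.
  if (attrs.filename !== undefined) {
    attrText += ` filename="${attrs.filename}"`
  }
  if (attrs.language !== undefined) {
    attrText += ` language="${attrs.language}"`
  }
  return `<smtcmp_block${attrText}>\n${content}\n</smtcmp_block>`
}

// Usage: a block for an existing file, and a bare block with no attributes.
console.log(
  wrapInSmtcmpBlock('## Section Title', {
    filename: 'path/to/file.md',
    language: 'markdown',
  }),
)
console.log(wrapInSmtcmpBlock('{{ content }}'))
```

Keeping the tag name in one place would make a rename like the one in this commit a single-line change, at the cost of one more indirection in the prompt templates.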
18 changes: 9 additions & 9 deletions src/utils/prompt.ts
@@ -67,14 +67,14 @@ export const parseRequestMessages = async (
 1. Please keep your response as concise as possible. Avoid being verbose.
-2. When the user is asking for edits to their markdown, please provide a simplified version of the markdown block emphasizing only the changes. Use comments to show where unchanged content has been skipped. Wrap the markdown block with <smtcmpBlock> tags. Add filename and language attributes to the <smtcmpBlock> tags. For example:
-<smtcmpBlock filename="path/to/file.md" language="markdown">
+2. When the user is asking for edits to their markdown, please provide a simplified version of the markdown block emphasizing only the changes. Use comments to show where unchanged content has been skipped. Wrap the markdown block with <smtcmp_block> tags. Add filename and language attributes to the <smtcmp_block> tags. For example:
+<smtcmp_block filename="path/to/file.md" language="markdown">
 <!-- ... existing content ... -->
 {{ edit_1 }}
 <!-- ... existing content ... -->
 {{ edit_2 }}
 <!-- ... existing content ... -->
-</smtcmpBlock>
+</smtcmp_block>
 The user has full access to the file, so they prefer seeing only the changes in the markdown. Often this will mean that the start/end of the file will be skipped, but that's okay! Rewrite the entire file only if specifically requested. Always provide a brief explanation of the updates, except when the user specifically asks for just the content.
 3. Do not lie or make up facts.
@@ -83,18 +83,18 @@ The user has full access to the file, so they prefer seeing only the changes in
 5. Format your response in markdown.
-6. When writing out new markdown blocks, also wrap them with <smtcmpBlock> tags. For example:
-<smtcmpBlock language="markdown">
+6. When writing out new markdown blocks, also wrap them with <smtcmp_block> tags. For example:
+<smtcmp_block language="markdown">
 {{ content }}
-</smtcmpBlock>
+</smtcmp_block>
-7. When providing markdown blocks for an existing file, add the filename and language attributes to the <smtcmpBlock> tags. Restate the relevant section or heading, so the user knows which part of the file you are editing. For example:
-<smtcmpBlock filename="path/to/file.md" language="markdown">
+7. When providing markdown blocks for an existing file, add the filename and language attributes to the <smtcmp_block> tags. Restate the relevant section or heading, so the user knows which part of the file you are editing. For example:
+<smtcmp_block filename="path/to/file.md" language="markdown">
 ## Section Title
 ...
 {{ content }}
 ...
-</smtcmpBlock>`,
+</smtcmp_block>`,
 }

 const currentFile = lastUserMessage.mentionables.find(