Смышляев Дмитрий #231

Open
wants to merge 83 commits into
base: master
Changes from 31 commits
Commits
e26d15e
Created a new project
vafle228 Nov 17, 2024
459f8f4
Outlined the class structure for the tokenizers
vafle228 Nov 24, 2024
350c559
Added parsers
vafle228 Nov 24, 2024
4a2abdb
Added the HTML code generator
vafle228 Nov 24, 2024
7889590
Algorithm implementation
vafle228 Nov 24, 2024
8a31839
Merge pull request #1 from vafle228/architecture
vafle228 Nov 24, 2024
9900175
Not doing TDD yet (forgot)
vafle228 Nov 26, 2024
65fc929
Finishing up tests 1.0
vafle228 Nov 26, 2024
9b750df
Finishing up tests 2.0
vafle228 Nov 26, 2024
7f664cd
Fixed the tokens
vafle228 Nov 26, 2024
c8fcea8
Added space as a separate token
vafle228 Nov 26, 2024
302aa2f
New scanner
vafle228 Nov 26, 2024
620fb70
Tweaked the tokens slightly
vafle228 Nov 26, 2024
cbb6061
Finishing up tests 3.0
vafle228 Nov 26, 2024
110c4d2
Fixed the AAA pattern in the test
vafle228 Nov 26, 2024
5408bec
Merge pull request #2 from vafle228/tokenizer
vafle228 Nov 26, 2024
2727bca
Added match tools
vafle228 Nov 26, 2024
7fc6935
Simplest italic
vafle228 Nov 26, 2024
8446c91
Improved the pattern builder API
vafle228 Nov 26, 2024
d22da48
Minor renaming
vafle228 Nov 26, 2024
d1d8413
Introduced the pattern as a separate entity
vafle228 Nov 26, 2024
5d586ba
New extensions
vafle228 Nov 26, 2024
fec4199
Rethinking the match
vafle228 Nov 26, 2024
fdd4a28
Added a rule for text
vafle228 Dec 1, 2024
676fd39
Pattern rule
vafle228 Dec 1, 2024
680e394
A crazy amount of everything
vafle228 Dec 1, 2024
25288d6
Double underscore support
vafle228 Dec 1, 2024
df27307
Minor logic fix
vafle228 Dec 1, 2024
43b7116
New tests
vafle228 Dec 1, 2024
3771ebd
Unpaired tag test
vafle228 Dec 1, 2024
647e1ff
Merge pull request #3 from vafle228/parser
vafle228 Dec 1, 2024
fac2a80
Code reorganization
vafle228 Dec 1, 2024
816efd4
Scaffolding for the bold tag
vafle228 Dec 1, 2024
fec2be8
Bold tag
vafle228 Dec 2, 2024
7664345
Italic tag fix
vafle228 Dec 2, 2024
d0ff733
Refined the rule
vafle228 Dec 2, 2024
89f2a8f
Refined the bold tag
vafle228 Dec 2, 2024
5ad9ed2
Preparation for new tags
vafle228 Dec 2, 2024
945a4a9
Paragraph tag
vafle228 Dec 2, 2024
c33f29b
Moving toward the future
vafle228 Dec 3, 2024
fdd61ec
New border tag
vafle228 Dec 3, 2024
9cf741a
To heck with this border rule altogether
vafle228 Dec 3, 2024
d902ab6
Removed a dumb null check
vafle228 Dec 3, 2024
69653bc
Refactored the paragraph rule
vafle228 Dec 3, 2024
2aee82e
Refactored the headline
vafle228 Dec 3, 2024
d36920c
Split up the test responsibilities
vafle228 Dec 3, 2024
a4dae45
A crazy amount of changes
vafle228 Dec 3, 2024
34760cc
Tests for the special rules
vafle228 Dec 3, 2024
81b9907
Or rule tests
vafle228 Dec 3, 2024
05b2a1a
Tests for the conditional rule
vafle228 Dec 8, 2024
992df7a
Major rethinking
vafle228 Dec 8, 2024
88d2842
Reworked the italic tag
vafle228 Dec 8, 2024
b3a605c
Rewrote the bold tag in the new style
vafle228 Dec 8, 2024
de49690
Finished fixing the remaining tags
vafle228 Dec 8, 2024
ed1043c
Got rid of unnecessary extensions
vafle228 Dec 8, 2024
245b5c6
Tests for the Kleene star
vafle228 Dec 8, 2024
81b7300
Tests for the continuous rule
vafle228 Dec 8, 2024
7650424
Tests for the headline
vafle228 Dec 8, 2024
03f7791
Escape tag
vafle228 Dec 9, 2024
764838a
Learned about cool strings
vafle228 Dec 9, 2024
c78f04d
Fixes for the escape tag
vafle228 Dec 9, 2024
0b33846
Body rule
vafle228 Dec 9, 2024
c946f8d
Merge pull request #4 from vafle228/parser
vafle228 Dec 9, 2024
2b6337a
Moved the tools one level up
vafle228 Dec 9, 2024
af53027
Added the generator interface
vafle228 Dec 9, 2024
36db829
Implemented the HTML generator
vafle228 Dec 9, 2024
7177671
Very strict tests
vafle228 Dec 9, 2024
1bb8d65
Tests for the HTML generator
vafle228 Dec 10, 2024
de2fa80
Performance test
vafle228 Dec 10, 2024
b1c321f
Fixed the body tests
vafle228 Dec 10, 2024
de913c7
Rewrote the tokens to use memory spans
vafle228 Dec 16, 2024
0cfad9e
Added levels for the headline
vafle228 Dec 16, 2024
ff0d4c1
Refined the bold and italic rules
vafle228 Dec 16, 2024
06f5619
Merge pull request #5 from vafle228/generator
vafle228 Dec 16, 2024
1f3cd85
Want to open an MR
vafle228 Dec 16, 2024
e462015
Added new tokens
vafle228 Dec 16, 2024
6d1b752
Added the href rule
vafle228 Dec 16, 2024
d9f6d32
Wrote up the specification
vafle228 Dec 16, 2024
1f5429a
Integrated href into the paragraph
vafle228 Dec 16, 2024
c36f843
Added href to the HTML generator
vafle228 Dec 16, 2024
6323061
Moved the spec into a markdown file
vafle228 Dec 16, 2024
d51a496
Tidied up a bit
vafle228 Dec 16, 2024
b8c92f5
Merge pull request #6 from vafle228/custom-tag
vafle228 Dec 16, 2024
13 changes: 13 additions & 0 deletions cs/Markdown/Generator/HTMLGenerator.cs
@@ -0,0 +1,13 @@
using Markdown.Parser.Nodes;

namespace Markdown.Generator;

public class HTMLGenerator

This comment was marked as resolved.

Author

Agreed: I'll do it once I start building the generators )

{

public string GenerateHTML(Node astRoot)
{
/* Do magic with ast root */
return "<h1>Hello world</h1>";
}
}
10 changes: 10 additions & 0 deletions cs/Markdown/Markdown.csproj
@@ -0,0 +1,10 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net8.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>

</Project>
7 changes: 7 additions & 0 deletions cs/Markdown/Parser/Nodes/Node.cs
@@ -0,0 +1,7 @@
namespace Markdown.Parser.Nodes;

public class Node(NodeType nodeType, int consumed)
A Data class again)

I don't understand what int Consumed is responsible for. Isn't it just Length, by any chance?

{
public int Consumed { get; } = consumed;
public NodeType NodeType { get; } = nodeType;
}
8 changes: 8 additions & 0 deletions cs/Markdown/Parser/Nodes/NodeType.cs
@@ -0,0 +1,8 @@
namespace Markdown.Parser.Nodes;

public enum NodeType
{
ITALIC,
BOLD,
TEXT
}
10 changes: 10 additions & 0 deletions cs/Markdown/Parser/Nodes/TagNode.cs
@@ -0,0 +1,10 @@
namespace Markdown.Parser.Nodes;

public class TagNode(NodeType nodeType, List<Node> children, int consumed) : Node(nodeType, consumed)
{
public List<Node> Children { get; } = children;

public TagNode(NodeType nodeType, Node child, int consumed)
:this(nodeType, [child], consumed)
{ }
}
15 changes: 15 additions & 0 deletions cs/Markdown/Parser/Nodes/TextNode.cs
@@ -0,0 +1,15 @@
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Nodes;

public class TextNode(int start, int consumed, List<Token> source) : Node(NodeType.TEXT, consumed)
And this one is definitely a violation of SRP and everything else put together (or I'm missing something). Why is the whole source passed in here?.. Why not just the tokens that are needed?

{
private readonly Lazy<Token> firstToken = new(source.Skip(start).First);
private readonly Lazy<Token> lastToken = new(source.Skip(start).Take(consumed).Last);
private readonly Lazy<List<Token>> tokens = new(source.Skip(start).Take(consumed).ToList);

public Token Last => lastToken.Value;
public Token First => firstToken.Value;
public List<Token> Tokens => tokens.Value;
public string Text => Tokens.ToText();
}
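
One way to read the review comment on `TextNode` above is that the node could receive only its own slice of tokens instead of the full `source` list. The following is a minimal sketch under that assumption, not the author's eventual fix: the `Token` record is a simplified stand-in for the PR's `Token` class, and the `FromSource` factory method is a hypothetical name introduced for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the PR's Token class (assumption: a token exposes its text).
public record Token(string Text);

public class TextNode(IReadOnlyList<Token> tokens)
{
    // The node owns only the tokens it actually covers.
    public IReadOnlyList<Token> Tokens { get; } = tokens;

    public Token First => Tokens[0];
    public Token Last => Tokens[Tokens.Count - 1];
    public int Consumed => Tokens.Count;
    public string Text => string.Concat(Tokens.Select(token => token.Text));

    // Hypothetical factory: slices the source once, so no Lazy fields
    // and no reference to the full token list are needed afterwards.
    public static TextNode FromSource(List<Token> source, int start, int consumed)
        => new(source.GetRange(start, consumed));
}
```

With this shape, a rule would call `TextNode.FromSource(tokens, begin, textLength)` and the node itself would no longer need `Skip`/`Take` over the whole list.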
12 changes: 12 additions & 0 deletions cs/Markdown/Parser/Rules/BodyRule.cs
@@ -0,0 +1,12 @@
using Markdown.Parser.Nodes;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Rules;

public class BodyRule : IParsingRule
{
public Node? Match(List<Token> tokens, int begin = 0)
{
throw new NotImplementedException();
}
}
12 changes: 12 additions & 0 deletions cs/Markdown/Parser/Rules/BoldRule.cs
@@ -0,0 +1,12 @@
using Markdown.Parser.Nodes;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Rules;

public class BoldRule : IParsingRule
{
public Node? Match(List<Token> tokens, int begin = 0)
{
throw new NotImplementedException();
And where is it...

}
}
12 changes: 12 additions & 0 deletions cs/Markdown/Parser/Rules/HeadlineRule.cs
@@ -0,0 +1,12 @@
using Markdown.Parser.Nodes;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Rules;

public class HeadlineRule : IParsingRule
{
public Node? Match(List<Token> tokens, int begin = 0)
{
throw new NotImplementedException();
}
}
9 changes: 9 additions & 0 deletions cs/Markdown/Parser/Rules/IParsingRule.cs
@@ -0,0 +1,9 @@
using Markdown.Parser.Nodes;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Rules;

public interface IParsingRule
{
public Node? Match(List<Token> tokens, int begin = 0);
}
46 changes: 46 additions & 0 deletions cs/Markdown/Parser/Rules/ItalicRule.cs
@@ -0,0 +1,46 @@
using Markdown.Parser.Nodes;
using Markdown.Parser.Rules.Tools;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Rules;

public class ItalicRule : IParsingRule
{
private readonly List<IParsingRule> defaultPattern =
[
new PatternRule(TokenType.UNDERSCORE),
new TextRule(),
new PatternRule(TokenType.UNDERSCORE),
];

private readonly List<IParsingRule> innerTagPattern =
[
new PatternRule(TokenType.UNDERSCORE),
new PatternRule(TokenType.WORD),
new PatternRule(TokenType.UNDERSCORE),
];

public Node? Match(List<Token> tokens, int begin = 0)
{
var pattern = ChoosePattern(tokens, begin);
var match = tokens.MatchPattern(pattern, begin);

if (match.Count != pattern.Count) return null;
if (match.Second() is not TextNode textNode) return null;

var endWithWord = textNode.Last.TokenType == TokenType.WORD;
var startWithWord = textNode.First.TokenType == TokenType.WORD;

return startWithWord && endWithWord ? BuildNode(textNode) : null;
}

private List<IParsingRule> ChoosePattern(List<Token> tokens, int begin = 0)
{
if (begin != 0 && tokens[begin - 1].TokenType == TokenType.WORD)
return innerTagPattern;
return defaultPattern;
}

private static TagNode BuildNode(TextNode textNode)
=> new(NodeType.ITALIC, textNode, textNode.Consumed + 2);
I wouldn't tie this to + 2. Markdown has different formats, so it's better to extract it into a constant

And it seems to make sense to move a Factory method like this into the nodes themselves, what do you think?

}
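
The two review comments on `BuildNode` above (name the `+ 2`, and move the factory into the node) could be combined roughly as follows. This is only a sketch: `Node`, `TagNode`, and `NodeType` are trimmed copies of the classes in this diff, while the constant `ItalicDelimiterTokens` and the factory method `TagNode.Italic` are hypothetical names that do not appear in the PR.

```csharp
using System.Collections.Generic;

public enum NodeType { ITALIC, BOLD, TEXT }

public class Node(NodeType nodeType, int consumed)
{
    public int Consumed { get; } = consumed;
    public NodeType NodeType { get; } = nodeType;
}

public class TagNode(NodeType nodeType, List<Node> children, int consumed) : Node(nodeType, consumed)
{
    // Hypothetical constant: an italic span is wrapped in one opening
    // and one closing underscore token.
    private const int ItalicDelimiterTokens = 2;

    public List<Node> Children { get; } = children;

    // Hypothetical factory: the node, not the parsing rule, knows how many
    // tokens its delimiters consume on top of the wrapped content.
    public static TagNode Italic(Node content)
        => new(NodeType.ITALIC, [content], content.Consumed + ItalicDelimiterTokens);
}
```

`ItalicRule.Match` would then end with `TagNode.Italic(textNode)` instead of computing the consumed length itself.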
22 changes: 22 additions & 0 deletions cs/Markdown/Parser/Rules/PatternRule.cs
@@ -0,0 +1,22 @@
using Markdown.Parser.Nodes;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Rules;

public class PatternRule(List<TokenType> pattern) : IParsingRule
{
public PatternRule(TokenType tokenType)
: this([tokenType])
{ }

public Node? Match(List<Token> tokens, int begin = 0)
{
if (pattern.Count == 0) return null;
if (tokens.Count - begin < pattern.Count) return null;
NIT: why do you think returning null is better than throwing an Exception?


var isMatched = tokens
.Skip(begin).Take(pattern.Count).Zip(pattern)
.All(pair => pair.First.TokenType == pair.Second);
return !isMatched ? null : new TextNode(begin, pattern.Count, tokens);
}
}
16 changes: 16 additions & 0 deletions cs/Markdown/Parser/Rules/TextRule.cs
@@ -0,0 +1,16 @@
using Markdown.Parser.Nodes;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Rules;

public class TextRule : IParsingRule
{
public Node? Match(List<Token> tokens, int begin = 0)
{
var textLength = tokens.Skip(begin).TakeWhile(IsText).Count();
return textLength == 0 ? null : new TextNode(begin, textLength, tokens);
}

private static bool IsText(Token token)
=> token.TokenType is TokenType.WORD or TokenType.SPACE;
Number? New Line?

}
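
The "Number? New Line?" comment above suggests `IsText` may be too narrow. A minimal sketch of widening it to numbers is shown below, assuming a `TokenType` enum with the members referenced elsewhere in this diff (`WORD`, `SPACE`, `NUMBER`, `UNDERSCORE`); the enum here is a stand-in, and the `TextTokens` helper class is a hypothetical name. A line-break token type is not visible in this diff, so it is deliberately left out rather than guessed.

```csharp
// Stand-in for the PR's TokenType enum; only members referenced in this diff are listed.
public enum TokenType { WORD, SPACE, NUMBER, UNDERSCORE }

public static class TextTokens
{
    // Assumption: numbers should be treated as plain text alongside words and spaces.
    // Whether line breaks belong here depends on how paragraphs are parsed.
    public static bool IsText(TokenType tokenType)
        => tokenType is TokenType.WORD or TokenType.SPACE or TokenType.NUMBER;
}
```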
39 changes: 39 additions & 0 deletions cs/Markdown/Parser/Rules/Tools/ListMatchExtensions.cs
@@ -0,0 +1,39 @@
using Markdown.Parser.Nodes;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser.Rules.Tools;

public static class ListMatchExtensions
{
public static List<Node> MatchPattern(this List<Token> tokens, List<IParsingRule> pattern, int begin = 0)
{
List<Node> nodes = [];

foreach (var patternRule in pattern)
{
var node = patternRule.Match(tokens, begin);
if (node is null) return [];
nodes.Add(node); begin += node.Consumed;
}
return nodes;
}

public static List<Node> KleenStarMatch(this List<Token> tokens, IParsingRule pattern, int begin = 0)
{
List<Node> nodes = [];
while (true)
{
var node = pattern.Match(tokens, begin);
if (node is null) return nodes;
begin += node.Consumed; nodes.Add(node);
Line break

}
}

public static Node? FirstMatch(this List<Token> tokens, List<IParsingRule> patterns, int begin = 0)
{
var match = patterns
.Select(rule => rule.Match(tokens, begin))
.FirstOrDefault(node => node is not null, null);
return match;
}
}
10 changes: 10 additions & 0 deletions cs/Markdown/Parser/Rules/Tools/ListOrderExtensions.cs
@@ -0,0 +1,10 @@
namespace Markdown.Parser.Rules.Tools;

public static class ListOrderExtensions
{
public static T? Second<T>(this List<T> list)
=> list.Count < 2 ? default : list[1];

public static T? Third<T>(this List<T> list)
=> list.Count < 3 ? default : list[2];
}
13 changes: 13 additions & 0 deletions cs/Markdown/Parser/TokenParser.cs
@@ -0,0 +1,13 @@
using Markdown.Parser.Nodes;
using Markdown.Parser.Rules;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Parser;

public class TokenParser
{
public Node Parse(List<Token> tokens)
{
return new BodyRule().Match(tokens);
}
}
20 changes: 20 additions & 0 deletions cs/Markdown/Program.cs
@@ -0,0 +1,20 @@
// See https://aka.ms/new-console-template for more information

using Markdown.Generator;
using Markdown.Parser;
using Markdown.Tokenizer;

namespace Markdown;

internal class Program
{
public static void Main(string[] args)
{
var markdown = "This _is_ a __sample__ markdown _file_.\n";
Category B advice: you could extract this line into a constant so that later, when you add new examples, it's easy to switch between them :)

        const string firstExample = "This _is_ a __sample__ markdown _file_.\n";
        const string secondExample = "#This is another __sample__ markdown _file_";
        var markdown = firstExample;

Author

Okay, will do :)


var tokens = new MarkdownTokenizer().Tokenize(markdown);
var astRoot = new TokenParser().Parse(tokens);

Console.WriteLine(new HTMLGenerator().GenerateHTML(astRoot));
}
}
26 changes: 26 additions & 0 deletions cs/Markdown/Tokenizer/MarkdownTokenizer.cs
@@ -0,0 +1,26 @@
using Markdown.Tokenizer.Scanners;
using Markdown.Tokenizer.Tokens;

namespace Markdown.Tokenizer;

public class MarkdownTokenizer
{
private readonly ITokenScanner[] scanners = [
new SpecScanner(), new NumberScanner(), new WordScanner()
];

public List<Token> Tokenize(string markdown)
{
var begin = 0;
var tokenList = new List<Token>();

while (begin < markdown.Length)
{
var token = scanners
.Select(sc => sc.Scan(markdown, begin))
.First(token => token is not null);
begin += token!.Length; tokenList.Add(token);
Move this to a new line please, it hurts my eyes ahaha

Author

Ahahahaha, okay, I'll move it

}
return tokenList;
}
}
8 changes: 8 additions & 0 deletions cs/Markdown/Tokenizer/Scanners/ITokenScanner.cs
@@ -0,0 +1,8 @@
using Markdown.Tokenizer.Tokens;

namespace Markdown.Tokenizer.Scanners;

public interface ITokenScanner
{
public Token? Scan(string markdown, int begin = 0);
}
17 changes: 17 additions & 0 deletions cs/Markdown/Tokenizer/Scanners/NumberScanner.cs
@@ -0,0 +1,17 @@
using Markdown.Tokenizer.Tokens;

namespace Markdown.Tokenizer.Scanners;

public class NumberScanner : ITokenScanner
{
public Token? Scan(string markdown, int begin = 0)
{
var numberIterator = markdown
.Skip(begin)
.TakeWhile(CanScan);
var numberLen = numberIterator.Count();
return numberLen == 0 ? null : new Token(TokenType.NUMBER, begin, numberLen, markdown);
}

public static bool CanScan(char symbol) => char.IsDigit(symbol);
}