Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: errata validation #693

Merged
merged 13 commits into from
Sep 3, 2024

Conversation

Rossi-Luciano
Copy link
Collaborator

O que esse PR faz?

Este PR implementa a validação de artigos relacionados em documentos XML, garantindo que para cada artigo de errata ou correção, os elementos relacionados estejam presentes e sejam consistentes. Especificamente, ele valida:

  • Se para um artigo de errata, existe pelo menos um <related-article> do tipo "corrected-article".
  • Se para um artigo corrigido, existe um número correspondente de <related-article> do tipo "correction-forward" e <history> do tipo "corrected" no histórico do artigo.

Essas validações asseguram a integridade das referências entre artigos, conforme as especificações do sistema.

Onde a revisão poderia começar?

A revisão pode começar pelo arquivo onde as classes ErrataValidation e CorrectedArticleValidation estão definidas, especificamente nos métodos de validação de artigos relacionados.

Como este poderia ser testado manualmente?

  1. Carregue um documento XML contendo artigos do tipo "correction" e "errata".
  2. Utilize as classes ErrataValidation e CorrectedArticleValidation para validar os artigos.
  3. Verifique se os seguintes cenários são corretamente validados:
  • Um artigo de errata deve conter pelo menos um <related-article related-article-type="corrected-article">.
  • Um artigo corrigido deve ter um <related-article related-article-type="correction-forward"> para cada <date date-type="corrected"> no histórico.
  1. Avalie se as respostas de validação retornadas refletem corretamente a presença ou ausência dos elementos esperados.
  2. Existem testes automatizados que cobrem os itens descritos anteriormente.

Algum cenário de contexto que queira dar?

NA.

Screenshots

NA.

Quais são tickets relevantes?

TK #691

Referências

NA.

# Classe base para validação de artigos, implementando a lógica comum de validação de artigos relacionados
class ValidationBase:
def __init__(self, xml_tree, related_article_type):
# Inicializa a árvore XML e configura o tipo de artigo relacionado a ser validado
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano para documentação da classe use """, pois os textos marcados desta forma geram documentação. Já os comentários com # não, mas que devem conter algo a mais de detalhes para a compreensão de outros desenvolvedores.

Copy link
Member

@robertatakenaka robertatakenaka Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano Existe um guia de estilo do python chamado PEP8. https://peps.python.org/pep-0008/#block-comments
Em resumo, os comentários devem anteceder os blocos de códigos. E os comentários na mesma linha ou desnecessários porque o código já está explícito devem ser evitados.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 35 to 48
if (
not self.related_articles
): # Se não houver artigos relacionados, retorna um erro
yield self._format_error_response(
expected=f'at least one <related-article related-article-type="{expected_related_article_type}">',
advice=f'provide <related-article related-article-type="{expected_related_article_type}">',
error_level=error_level,
)
else:
# Se houver artigos relacionados, valida cada um e retorna um sucesso
for article in self.related_articles:
yield self._format_success_response(
article, expected_message, error_level
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano tente usar a estrutura for+ else

for n in range(2, 10):
    for x in range(2, n):
        if n % x == 0:
            print(n, 'equals', x, '*', n//x)
            break
    else:
        # loop fell through without finding a factor
        print(n, 'is a prime number')

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

):
if (
self.article_type == "correction"
): # Verifica se o tipo de artigo é uma correção
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano comentário desnecessário

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano No lugar de "correction", passe o valor como parâmetro (expected_article_type)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected_message,
advice,
error_level="ERROR",
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano adicionar docstring no lugar de #. O conteúdo pode ser Valida somente se o artigo é do tipo errata

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Método auxiliar para formatar uma resposta de erro
def _format_error_response(self, expected, advice, error_level):
return format_response(
title="errata",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano o valor de title não pode ser errato por dois motivos: 1. isso é uma classe Base, deveria ter um conteúdo genérico; 2. O que está validando deveria ser o title

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Método auxiliar para formatar uma resposta de sucesso
def _format_success_response(self, article, expected_message, error_level):
return format_response(
title="errata",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano mesmo comentário anterior

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano estes dois métodos format não me parecem úteis

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


# Classe base para validação de artigos, implementando a lógica comum de validação de artigos relacionados
class ValidationBase:
def __init__(self, xml_tree, related_article_type):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano tem que adicionar o expected_article_type como parâmetro

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trocar related_article_type por expected_related_article_type

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ou seja, você terá os dois parâmetros: expected_article_type e expected_related_article_type

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Método principal de validação de artigos relacionados, que pode ser sobrescrito por subclasses
def validate_related_article(
self,
expected_related_article_type,
Copy link
Member

@robertatakenaka robertatakenaka Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano você não está usando o self.expected_related_article_type

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 129 to 137
# Verifica se o número de datas corrigidas é menor que o número de artigos relacionados
history_date_count = len(self.history_dates)
related_article_count = len(self.related_articles)

if history_date_count < related_article_count:
# Retorna um erro se o número de datas corrigidas for insuficiente
yield format_response(
title="errata",
parent="article",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano separa em outra validação

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@robertatakenaka robertatakenaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano no geral ficou muito bom a questão da herança. Mas tem algumas questões de estilo do python PEP8 sobre os comentários. Além disso, outras melhorias e um bug.

]
self.validation_title = validation_title
self.article_lang = xml_tree.get("{http://www.w3.org/XML/1998/namespace}lang")
self.article_type = xml_tree.xpath(".")[0].get("article-type")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano para estes casos use o find que pega o nó diretamente

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super().__init__(xml_tree, related_article_type="corrected-article")
super().__init__(xml_tree, related_article_type="corrected-article", validation_title="errata")

def validate_related_article(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano você pode eliminar este método inteiro, pois a classe Base já o implementou de forma genérica.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class ValidationBase:
def __init__(self, xml_tree, related_article_type):
# Inicializa a árvore XML e configura o tipo de artigo relacionado a ser validado
def __init__(self, xml_tree, related_article_type, validation_title):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano tinha pedido para você adicionar expected_article_type e expected_related_article_type (mudar o nome de related_article_type)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

advice=f'provide <related-article related-article-type="{expected_related_article_type}">',
)

def _get_related_articles(self, xml_tree, related_article_type):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano aqui vc não precisa de xml_tree, related_article_type como parâmetros já que eles estariam acessíveis no momento que se instancia a classe. Crie atributos com eles self.expected_related_article_type e use self.xml_tree.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yield self._format_error_response(
expected=f'at least one <related-article related-article-type="{expected_related_article_type}">',
advice=f'provide <related-article related-article-type="{expected_related_article_type}">',
def validate_related_article(self, expected_article_type, expected_related_article_type, expected_response,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano o único parâmetro realmente necessário é o error_level. Os expected_article_type e expected_related_article_type tem que estar no init e o expected_response, você cria usando o expected_related_article_type

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yield self._format_response(
title=title,
validation_type="exist",
expected=f'at least one <related-article related-article-type="{self.expected_related_article_type}">',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano expected=expected_response,

title=title,
parent=related_article.get("parent") if related_article else "article",
parent_id=related_article.get("parent_id") if related_article else None,
parent_article_type="correction",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano na classe base não deveria ter dados definidos. Trocar 'correction' por variável

"""
yield from super().validate_related_article(error_level=error_level, title=title)

def validate_related_articles_and_history_dates(self, error_level="ERROR"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trocar por validate_history_dates

]
}
]
for i, item in enumerate(obtained):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano expected

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano em todos adiciton o assert para verificar o tamanho da lista obtained

"data": None,
}
]
for i, item in enumerate(obtained):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano expected

},
}
]
for i, item in enumerate(obtained):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano expected

Copy link
Member

@robertatakenaka robertatakenaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano fazer correções

@@ -70,7 +70,7 @@ def _format_response(self, title, validation_type, expected, error_level, is_val
title=title,
parent=related_article.get("parent") if related_article else "article",
parent_id=related_article.get("parent_id") if related_article else None,
parent_article_type="correction",
parent_article_type=related_article.get("parent_article_type") if related_article else None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano forma alternativa:

parent_article_type=related_article and related_article.get("parent_article_type")

@@ -25,7 +25,7 @@ def test_validate_related_article_not_found(self):
"title": "validation matching 'correction' and 'corrected-article'",
"parent": "article",
"parent_id": None,
"parent_article_type": "correction",
"parent_article_type": None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano Por que é None? não deveria ser 'correction'?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertatakenaka corrigido

Copy link
Member

@robertatakenaka robertatakenaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano Verificar os testes

Comment on lines 71 to 74
parent=related_article.get("parent") if related_article else "article",
parent_id=related_article.get("parent_id") if related_article else None,
parent_article_type=related_article.get("parent_article_type") if related_article else self.article_type,
parent_lang=related_article.get("parent_lang") if related_article else self.article_lang,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rossi-Luciano olhando estes vários if que são todos referentes a existência ou não do related_article, fica evidente que def _format_response é desnecessário. Esta verificação de existência já ocorreu no trecho das linhas 31 a 52. Sugiro remover e adequar o trecho 31 a 52.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertatakenaka robertatakenaka merged commit d96beaa into scieloorg:master Sep 3, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants