[feat] Add support for censoring XML, HTML and raw text response bodies #52

Draft: wants to merge 6 commits into base: master
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,20 @@
# CHANGELOG

## Next Release

- Add support for censoring XML, HTML and plain text bodies
- [BREAKING CHANGE] `CensorElement` is now an abstract base class and cannot be used directly
- Three types of `CensorElement` are available:
  - `KeyCensorElement`: Censor the value of a specified key (ignored if used for plain text/HTML data)
  - `RegexCensorElement`: Censor any string that matches a specified regex pattern (checks the value of each key-value pair if used for JSON/XML data)
  - `TextCensorElement`: Censor a specified string (checks the value of each key-value pair if used for JSON/XML data; requires the whole body to match the specified string if used for plain text/HTML data)
- Body censoring: Use `KeyCensorElement` (recommended for JSON/XML if the key is known), `TextCensorElement` (recommended for JSON/XML if the value is known), or `RegexCensorElement` (recommended for plain text/HTML)
- Path element censoring: Use `RegexCensorElement`
- Query parameter censoring: Use `KeyCensorElement`
- Header censoring: Use `KeyCensorElement`
- [BREAKING CHANGE] `CensorHeadersByKeys`, `CensorBodyElementsByKeys`, `CensorQueryParametersByKeys` and `CensorPathElementsByPatterns` have been removed
  - Use `CensorHeaders`, `CensorBodyElements`, `CensorQueryParameters` and `CensorPathElements` instead
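
The body-censoring options above could be combined roughly as follows. This is a minimal sketch modelled on the tests in this pull request, not a definitive usage guide. It assumes the types live in the `EasyVCR` namespace, and it passes `false` as the second constructor argument only because the tests do; the meaning of that argument is not described here. The replacement string `"*****"` is an arbitrary placeholder.

```csharp
using System.Collections.Generic;
using EasyVCR; // assumption: namespace inferred from the test project, EasyVCR.Tests

// Placeholder written into the cassette wherever a censor element matches.
var censorString = "*****";

var advancedSettings = new AdvancedSettings
{
    Censors = new Censors(censorString).CensorBodyElements(
        new List<CensorElement>
        {
            // JSON/XML body, key known: censor the value stored under "title"
            new KeyCensorElement("title", false),
            // JSON/XML body, value known: censor values equal to this string
            new TextCensorElement("r/ProgrammerHumor", false),
            // plain text/HTML body: censor anything matching the pattern
            new RegexCensorElement(@"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", false),
        }),
};
```

Only `CensorBodyElements` is shown because it is the only one of the four replacement methods whose call shape appears in this pull request's tests.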

## v0.9.0 (2023-05-17)

- Fix a bug where URLs were not being extracted correctly, potentially causing false matches when matching by URL
336 changes: 336 additions & 0 deletions EasyVCR.Tests/CensorsTest.cs
@@ -1,4 +1,8 @@
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
using System.Xml;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace EasyVCR.Tests
@@ -201,5 +205,337 @@ public void TestApplyPathElementsCensorsNoCensorsReturnsOriginalUrl()

            Assert.AreEqual(url, result);
        }

        /// <summary>
        ///     Test TextCensorElement works for XML bodies
        /// </summary>
        [TestMethod]
        public async Task TestTextCensorOnXml()
        {
            var cassette = TestUtils.GetCassette("test_text_censor_on_xml");
            cassette.Erase(); // Erase cassette before recording

            // set up advanced settings
            var censorString = Guid.NewGuid().ToString(); // generate random string, high chance of not being in original data
            var advancedSettings = new AdvancedSettings
            {
                Censors = new Censors(censorString).CensorBodyElements(
                    new List<CensorElement>
                    {
                        // censor the word "r/ProgrammerHumor"
                        new TextCensorElement("r/ProgrammerHumor", false),
                    }),
            };

            // record cassette with advanced settings first
            var client = HttpClients.NewHttpClient(cassette, Mode.Record, advancedSettings);
            var fakeDataService = new FakeDataService(client);
            var _ = await fakeDataService.GetXmlDataRawResponse();

            // now replay cassette
            client = HttpClients.NewHttpClient(cassette, Mode.Replay, advancedSettings);
            fakeDataService = new FakeDataService(client);
            var xmlData = await fakeDataService.GetXmlData();

            Assert.IsNotNull(xmlData);
            var xmlDocument = new XmlDocument();
            xmlDocument.LoadXml(xmlData);

            // word "r/ProgrammerHumor" should be censored
            // for testing purposes, we know this is the "label" property of the "category" node under "feed"
            var categoryNode = xmlDocument.FirstChild?.FirstChild;
            Assert.IsNotNull(categoryNode);
            Assert.AreEqual(censorString, categoryNode.Attributes["label"].Value);
        }

        /// <summary>
        ///     Test KeyCensorElement works for XML bodies
        /// </summary>
        [TestMethod]
        public async Task TestKeyCensorOnXml()
        {
            var cassette = TestUtils.GetCassette("test_key_censor_on_xml");
            cassette.Erase(); // Erase cassette before recording

            // set up advanced settings
            var censorString = Guid.NewGuid().ToString(); // generate random string, high chance of not being in original data
            var advancedSettings = new AdvancedSettings
            {
                Censors = new Censors(censorString).CensorBodyElements(
                    new List<CensorElement>
                    {
                        // censor the value of the "title" key
                        new KeyCensorElement("title", false),
                    }),
            };

            // record cassette with advanced settings first
            var client = HttpClients.NewHttpClient(cassette, Mode.Record, advancedSettings);
            var fakeDataService = new FakeDataService(client);
            var _ = await fakeDataService.GetXmlDataRawResponse();

            // now replay cassette
            client = HttpClients.NewHttpClient(cassette, Mode.Replay, advancedSettings);
            fakeDataService = new FakeDataService(client);
            var xmlData = await fakeDataService.GetXmlData();

            Assert.IsNotNull(xmlData);
            var xmlDocument = new XmlDocument();
            xmlDocument.LoadXml(xmlData);

            // whole value of "title" key should be censored
            var nodes = xmlDocument.SelectNodes("//title");
            Assert.IsNotNull(nodes);
            foreach (XmlNode node in nodes)
            {
                Assert.AreEqual(censorString, node.InnerText);
            }
        }

        /// <summary>
        ///     Test RegexCensorElement works for XML bodies
        /// </summary>
        [TestMethod]
        public async Task TestRegexCensorOnXml()
        {
            var cassette = TestUtils.GetCassette("test_regex_censor_on_xml");
            cassette.Erase(); // Erase cassette before recording

            // set up advanced settings
            var censorString = Guid.NewGuid().ToString(); // generate random string, high chance of not being in original data
            var advancedSettings = new AdvancedSettings
            {
                Censors = new Censors(censorString).CensorBodyElements(
                    new List<CensorElement>
                    {
                        // censor any value that looks like a date stamp
                        new RegexCensorElement(@"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", false),
                    }),
            };

            // record cassette with advanced settings first
            var client = HttpClients.NewHttpClient(cassette, Mode.Record, advancedSettings);
            var fakeDataService = new FakeDataService(client);
            var _ = await fakeDataService.GetXmlDataRawResponse();

            // now replay cassette
            client = HttpClients.NewHttpClient(cassette, Mode.Replay, advancedSettings);
            fakeDataService = new FakeDataService(client);
            var xmlData = await fakeDataService.GetXmlData();

            Assert.IsNotNull(xmlData);
            var xmlDocument = new XmlDocument();
            xmlDocument.LoadXml(xmlData);

            // all values that look like URLs should be censored
            // for testing purposes, we know this is stored in the "uri" nodes
            var nodes = xmlDocument.SelectNodes("//uri");
            Assert.IsNotNull(nodes);
            foreach (XmlNode node in nodes)
            {
                Assert.AreEqual(censorString, node.InnerText);
            }
        }

        [Ignore("Hard to test")]
        [TestMethod]
        public async Task TestTextCensorOnHtml()
        {
            // TextCensorElement requires the whole HTML body to match the specified string
            // Would need an HTML page with a small body to test this
            Assert.Fail();
        }

        [Ignore("Can't use KeyCensorElement on HTML bodies")]
        [TestMethod]
        public async Task TestKeyCensorOnHtml()
        {
            Assert.Fail("Can't use KeyCensorElement on HTML bodies");
        }

        [TestMethod]
        public async Task TestRegexCensorOnHtml()
        {
            var cassette = TestUtils.GetCassette("test_regex_censor_on_html");
            cassette.Erase(); // Erase cassette before recording

            // set up advanced settings
            var censorString = Guid.NewGuid().ToString(); // generate random string, high chance of not being in original data
            const string pattern = "<head>.*</head>";
            var advancedSettings = new AdvancedSettings
            {
                Censors = new Censors(censorString).CensorBodyElements(
                    new List<CensorElement>
                    {
                        // censor the pattern
                        new RegexCensorElement(pattern, false),
                    }),
            };

            // record cassette with advanced settings first
            var client = HttpClients.NewHttpClient(cassette, Mode.Record, advancedSettings);
            var fakeDataService = new FakeDataService(client);
            var _ = await fakeDataService.GetHtmlDataRawResponse();

            // now replay cassette
            client = HttpClients.NewHttpClient(cassette, Mode.Replay, advancedSettings);
            fakeDataService = new FakeDataService(client);
            var textData = await fakeDataService.GetHtmlData();

            Assert.IsNotNull(textData);

            // censored pattern should no longer exist, and censor string should exist
            Assert.IsFalse(Regex.IsMatch(textData, pattern));
            Assert.IsTrue(textData.Contains(censorString));
        }

        /// <summary>
        ///     Test TextCensorElement works for plain text bodies
        /// </summary>
        [TestMethod]
        public async Task TestTextCensorOnText()
        {
            var cassette = TestUtils.GetCassette("test_text_censor_on_text");
            cassette.Erase(); // Erase cassette before recording

            // set up advanced settings
            var censorString = Guid.NewGuid().ToString(); // generate random string, high chance of not being in original data
            const string textToCensor = "# UGAArchive\nArchives of projects I did as a student at The University of Georgia\n";
            var advancedSettings = new AdvancedSettings
            {
                Censors = new Censors(censorString).CensorBodyElements(
                    new List<CensorElement>
                    {
                        // censor the text
                        new TextCensorElement(textToCensor, false),
                    }),
            };

            // record cassette with advanced settings first
            var client = HttpClients.NewHttpClient(cassette, Mode.Record, advancedSettings);
            var fakeDataService = new FakeDataService(client);
            var _ = await fakeDataService.GetRawDataRawResponse();

            // now replay cassette
            client = HttpClients.NewHttpClient(cassette, Mode.Replay, advancedSettings);
            fakeDataService = new FakeDataService(client);
            var textData = await fakeDataService.GetRawData();

            Assert.IsNotNull(textData);

            // censored text should no longer exist, and censor string should exist
            Assert.IsFalse(textData.Contains(textToCensor));
            Assert.IsTrue(textData.Contains(censorString));
        }

        [Ignore("Can't use KeyCensorElement on plain text bodies")]
        [TestMethod]
        public async Task TestKeyCensorOnText()
        {
            Assert.Fail("Can't use KeyCensorElement on plain text bodies");
        }

        /// <summary>
        ///     Test RegexCensorElement works for plain text bodies
        /// </summary>
        [TestMethod]
        public async Task TestRegexCensorOnText()
        {
            var cassette = TestUtils.GetCassette("test_regex_censor_on_text");
            cassette.Erase(); // Erase cassette before recording

            // set up advanced settings
            var censorString = Guid.NewGuid().ToString(); // generate random string, high chance of not being in original data
            const string pattern = "^# UGAArchive";
            var advancedSettings = new AdvancedSettings
            {
                Censors = new Censors(censorString).CensorBodyElements(
                    new List<CensorElement>
                    {
                        // censor the pattern
                        new RegexCensorElement(pattern, false),
                    }),
            };

            // record cassette with advanced settings first
            var client = HttpClients.NewHttpClient(cassette, Mode.Record, advancedSettings);
            var fakeDataService = new FakeDataService(client);
            var _ = await fakeDataService.GetRawDataRawResponse();

            // now replay cassette
            client = HttpClients.NewHttpClient(cassette, Mode.Replay, advancedSettings);
            fakeDataService = new FakeDataService(client);
            var textData = await fakeDataService.GetRawData();

            Assert.IsNotNull(textData);

            // censored pattern should no longer exist, and censor string should exist
            Assert.IsFalse(Regex.IsMatch(textData, pattern));
            Assert.IsTrue(textData.Contains(censorString));
        }

        /// <summary>
        ///     Test that we can mix and match censor elements
        /// </summary>
        [TestMethod]
        public async Task TestMixAndMatchCensorElements()
        {
            var cassette = TestUtils.GetCassette("test_mix_and_match_censor_elements");
            cassette.Erase(); // Erase cassette before recording

            // set up advanced settings
            var censorString = Guid.NewGuid().ToString(); // generate random string, high chance of not being in original data
            var advancedSettings = new AdvancedSettings
            {
                Censors = new Censors(censorString).CensorBodyElements(
                    new List<CensorElement>
                    {
                        // censor the word "r/ProgrammerHumor"
                        new TextCensorElement("r/ProgrammerHumor", false),
                        // censor the value of the "title" key
                        new KeyCensorElement("title", false),
                        // censor any value that looks like a date stamp
                        new RegexCensorElement(@"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", false),
                    }),
            };

            // record cassette with advanced settings first
            var client = HttpClients.NewHttpClient(cassette, Mode.Record, advancedSettings);
            var fakeDataService = new FakeDataService(client);
            var _ = await fakeDataService.GetXmlDataRawResponse();

            // now replay cassette
            client = HttpClients.NewHttpClient(cassette, Mode.Replay, advancedSettings);
            fakeDataService = new FakeDataService(client);
            var xmlData = await fakeDataService.GetXmlData();

            // check that the xml data was censored
            Assert.IsNotNull(xmlData);
            var xmlDocument = new XmlDocument();
            xmlDocument.LoadXml(xmlData);

            // word "r/ProgrammerHumor" should be censored
            // for testing purposes, we know this is the "label" property of the "category" node under "feed"
            var categoryNode = xmlDocument.FirstChild?.FirstChild;
            Assert.IsNotNull(categoryNode);
            Assert.AreEqual(censorString, categoryNode.Attributes["label"].Value);

            // whole value of "title" key should be censored
            var nodes = xmlDocument.SelectNodes("//title");
            Assert.IsNotNull(nodes);
            foreach (XmlNode node in nodes)
            {
                Assert.AreEqual(censorString, node.InnerText);
            }

            // all values that look like URLs should be censored
            // for testing purposes, we know this is stored in the "uri" nodes
            nodes = xmlDocument.SelectNodes("//uri");
            Assert.IsNotNull(nodes);
            foreach (XmlNode node in nodes)
            {
                Assert.AreEqual(censorString, node.InnerText);
            }
        }
    }
}