feat(functions): add extractTextFromHTML (#1)
Initially the plan was to use the JSDOM npm package for the `extractTextFromHTML` function. However, deciding where to place JSDOM in the dependency tree required significant thought:

- placing JSDOM in `dependencies` makes this package 3MB bigger, which is unacceptable considering that JSDOM is used in just one function;
- placing JSDOM in `optionalDependencies` makes it possible to skip installing it, but only if you explicitly tell the CLI to do so. In terms of size this is almost no different from `dependencies`, since hardly anyone researches a package's documentation just to find out whether a very specific option is needed. It also adds a 0.4s - 0.5s slowdown to the function due to the dynamic import;
- lastly, placing JSDOM in `peerDependencies` is of no use here, as the logic of `peerDependencies` is much more complex than this case requires.

Instead, I chose to add another argument to the function, named `domParser`, through which you provide the DOMParser of your choice. That way, if @resultium/utils is used in the browser, there is no additional load time or slowdown, because the browser's default DOMParser is used, and on the server you can supply your own parser, whether it comes from JSDOM, cheerio, puppeteer or anything else.

JSDOM has still been added as a devDependency to make the tests possible, as they run on the server side.

Resolves #1
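To illustrate the parser-injection idea described in this message, here is a minimal sketch. It is not the implementation from this commit: the `DomParserLike` interface, the function body, and `stubParser` are assumptions made for illustration. The stub strips tags with a regex only so that the sketch has no dependencies; in practice you would pass the browser's `new DOMParser()` or a parser taken from a JSDOM window.

```typescript
// Hypothetical structural type: any object with a DOMParser-compatible
// parseFromString method is accepted (browser DOMParser, JSDOM's, ...).
interface DomParserLike {
  parseFromString(
    html: string,
    mimeType: "text/html"
  ): { body: { textContent: string | null } | null };
}

// Sketch only: the real function in src/ may differ in details.
function extractTextFromHTML(html: string, domParser: DomParserLike): string {
  const doc = domParser.parseFromString(html, "text/html");
  // Fall back to an empty string if the parser produced no body or no text.
  return doc.body?.textContent ?? "";
}

// Stand-in parser used only to keep this sketch dependency-free.
// It removes tags with a regex and is NOT a real HTML parser.
const stubParser: DomParserLike = {
  parseFromString: (html) => ({
    body: { textContent: html.replace(/<[^>]*>/g, "") },
  }),
};

const text = extractTextFromHTML(
  "<p><a>Lorem ipsum</a> dolor sit</p>",
  stubParser
);
```

Because the function depends only on the shape of the parser, the package itself never has to import JSDOM; the caller decides which parser to pay for.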
11
tests/extractTextFromHTML.test.ts
Normal file
@@ -0,0 +1,11 @@
import { expect, test } from "@jest/globals";
import { extractTextFromHTML } from "../src";
import { JSDOM } from "jsdom";

const domParser = new (new JSDOM().window.DOMParser)();

test("extracts text content from an HTML string", () => {
  const html = "<p><a>Lorem ipsum</a> dolor sit</p>";

  expect(extractTextFromHTML(html, domParser)).toBe("Lorem ipsum dolor sit");
});