How to Capture and Analyze JavaScript Errors

Front-end engineers all know that JavaScript has basic exception handling. We can throw new Error() ourselves, and the browser also throws an exception when we call an API incorrectly. Yet most front-end engineers have probably never thought about collecting this exception information. After all, as long as a JavaScript error does not reappear after a refresh, the user can fix the problem by refreshing; the browser does not crash, so we can pretend nothing happened. That assumption held before Single Page Apps became popular. Today a Single Page App may have been running for some time and built up extremely complex state, and the user may have gone through a number of input steps to get there. Can we really just tell them to refresh? Wouldn't all their previous work have to be redone? So we do need to capture and analyze this exception information, and then change the code so that the user experience is not affected.

Ways to catch exceptions

Exceptions that we throw ourselves with throw new Error() are of course easy to catch, because we know exactly where each throw is written. Exceptions raised by browser API calls are not necessarily so easy to catch. Some APIs are specified by the standard to throw exceptions, while others throw only because of implementation differences or defects in individual browsers. The former we can catch with try-catch; for the latter we have to listen for global exceptions.

try-catch

If a browser API is known to throw exceptions, we need to put the call inside a try-catch so that the error does not leave the whole program in an invalid state. window.localStorage is one such API: writing data throws an exception once the capacity limit is exceeded, and the same happens in Safari's private browsing mode.

try {
  localStorage.setItem('date', Date.now());
} catch (error) {
  reportError(error);
}

Another common use of try-catch is around callbacks. The code of a callback function is outside our control: we do not know its quality, or whether it calls other APIs that throw exceptions. So that an error in one callback does not prevent the code after the callback invocation from running, wrapping the invocation in a try-catch is a must.

listeners.forEach(function(listener) {
  try {
    listener();
  } catch (error) {
    reportError(error);
  }
});

window.onerror

Exceptions thrown in places that try-catch does not cover can only be caught through window.onerror.

window.onerror =
  function(errorMessage, scriptURI, lineNumber) {
    reportError({
      message: errorMessage,
      script: scriptURI,
      line: lineNumber
    });
}

Be careful not to get clever and listen for errors through window.addEventListener or window.attachEvent. Many browsers implement only window.onerror, or only their window.onerror implementation is standards-compliant. Since the draft standard also defines window.onerror, using window.onerror is enough.

Missing attributes

Suppose we have a reportError function that collects caught exceptions and sends them in batches to the server, where they are stored for analysis. What information should we collect? The useful pieces include the error type (name), error message (message), script file address (script), line number (line), column number (column), and stack trace (stack). If an exception is caught through try-catch, all of this is available on the Error object (major browsers support it), so reportError can collect it. But if it is caught through window.onerror, this handler, as we all know, receives only three parameters, so everything beyond those three parameters is lost.
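As a rough sketch of what such a reportError could look like, the following collects the attributes listed above and hands them over in batches. The factory name createErrorReporter, the send callback, and the batch size are assumptions made for illustration and do not come from the original text; in a browser, send would typically POST the batch to your own collection endpoint.

```javascript
// Hypothetical helper: builds a reportError function that buffers
// reports and passes them to `send` once `batchSize` have accumulated.
function createErrorReporter(send, batchSize) {
  var buffer = [];

  function reportError(error) {
    // Copy only the attributes discussed above. Any of them may be
    // undefined, depending on the browser and on how the error was caught.
    buffer.push({
      name: error.name,
      message: error.message,
      script: error.script,
      line: error.line,
      column: error.column,
      stack: error.stack
    });
    if (buffer.length >= batchSize) {
      reportError.flush();
    }
  }

  // Send whatever is buffered, for example before the page unloads.
  reportError.flush = function () {
    if (buffer.length === 0) {
      return;
    }
    var batch = buffer;
    buffer = [];
    send(batch);
  };

  return reportError;
}
```

Passing send in from outside keeps the transport (XMLHttpRequest, an image beacon, and so on) separate from the collection logic.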

Serializing the message

If we created the Error object ourselves, then error.message is under our control. Basically, whatever we put into error.message is what the first parameter (message) of window.onerror will be. (The browser actually modifies it slightly, for example by adding an 'Uncaught Error: ' prefix.) So we can serialize the attributes we care about (for example with JSON.stringify), store the result in error.message, and then deserialize it when reading it back in window.onerror. Of course, this only works for Error objects that we create ourselves.
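A minimal sketch of this round trip might look as follows. The marker string and both function names are hypothetical; the marker is only there so that our own serialized messages can be told apart from ordinary ones and found again behind whatever prefix the browser adds.

```javascript
// Hypothetical marker that identifies messages serialized by us.
var MARKER = '@@attrs:';

// Create an Error whose message carries extra attributes as JSON.
function createSerializedError(message, attributes) {
  var payload = { message: message, attributes: attributes };
  return new Error(MARKER + JSON.stringify(payload));
}

// Recover the payload from the first argument of window.onerror.
// The browser may prepend text such as 'Uncaught Error: ', so search
// for the marker instead of assuming it sits at position 0.
function parseSerializedMessage(errorMessage) {
  var text = String(errorMessage);
  var index = text.indexOf(MARKER);
  if (index === -1) {
    return { message: errorMessage, attributes: {} };
  }
  try {
    return JSON.parse(text.slice(index + MARKER.length));
  } catch (parseError) {
    // Not valid JSON after all; fall back to the raw message.
    return { message: errorMessage, attributes: {} };
  }
}
```

Inside window.onerror, parseSerializedMessage(errorMessage) would then give back both the original message and the custom attributes, while messages from errors we did not create pass through unchanged.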

The fifth parameter

Browser vendors are also aware of the limitations people run into when using window.onerror, so they began adding new parameters to it. Since having a line number without a column number did not look very symmetrical, IE was first to add the column number, as the fourth parameter. What everyone cared about more, though, was getting the complete stack, so Firefox proposed putting the stack in the fifth parameter. Chrome countered that it would be better to put the entire Error object in the fifth parameter, so that any attribute could be read from it, including custom ones. Because Chrome moved faster and implemented the new window.onerror signature in Chrome 30, the draft standard was written to match.

window.onerror = function(
  errorMessage,
  scriptURI,
  lineNumber,
  columnNumber,
  error
) {
  if (error) {
    reportError(error);
  } else {
    reportError({
      message: errorMessage,
      script: scriptURI,
      line: lineNumber,
      column: columnNumber
    });
  }
}

Attribute normalization

The Error object attributes discussed above use Chrome's names, but browsers name the attributes of Error objects differently. For example, the script file address is called script in Chrome but fileName in Firefox. We therefore need a dedicated function that normalizes an Error object, that is, one that maps the different attribute names onto a single set of names. For a concrete approach, see this article. Browser implementations will keep changing, but maintaining such a mapping table by hand is not too difficult.
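Such a mapping table could be sketched roughly as below. The function name normalizeError and the table itself are illustrative assumptions, and the table is almost certainly incomplete: fileName, lineNumber, and columnNumber are Firefox's non-standard Error properties, sourceURL, line, and column are the ones Safari uses, and script is simply the name used in this article.

```javascript
// Candidate property names for each normalized attribute, in lookup order.
// Illustrative only; extend the lists as browsers change.
var ATTRIBUTE_ALIASES = {
  script: ['script', 'fileName', 'sourceURL'],
  line: ['line', 'lineNumber'],
  column: ['column', 'columnNumber']
};

// Return a plain object that uses one consistent set of attribute names.
function normalizeError(error) {
  var normalized = {
    name: error.name,
    message: error.message,
    stack: error.stack
  };
  Object.keys(ATTRIBUTE_ALIASES).forEach(function (target) {
    var aliases = ATTRIBUTE_ALIASES[target];
    for (var i = 0; i < aliases.length; i++) {
      if (error[aliases[i]] !== undefined) {
        normalized[target] = error[aliases[i]];
        break; // the first alias that is present wins
      }
    }
  });
  return normalized;
}
```

Calling normalizeError at the top of reportError would let the rest of the reporting code rely on script, line, and column regardless of which browser produced the error.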

The format of the stack trace (stack) is a similar case. This attribute stores, as plain text, a copy of the stack at the moment the exception occurred. Because each browser uses a different text format, we also have to maintain regular expressions by hand to extract each frame's function name (identifier), file (script), line number (line), and column number (column) from the plain text.
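To make this concrete, here is a small sketch that handles two common frame formats: the "at name (file:line:column)" style used by Chrome and the "name@file:line:column" style used by Firefox. The regular expressions and the function name parseStack are assumptions for illustration; real stack traces contain more variants (eval frames, native frames, and so on) that this sketch simply skips.

```javascript
// "    at step2 (http://cdn.com/step1.js:10:15)" or, for anonymous
// frames, "    at http://cdn.com/step1.js:10:15".
var CHROME_FRAME = /^\s*at (?:(.+?) \()?(.+?):(\d+):(\d+)\)?\s*$/;
// "step2@http://cdn.com/step1.js:10:15" or "@http://cdn.com/step1.js:10:15".
var FIREFOX_FRAME = /^(.*?)@(.+?):(\d+):(\d+)\s*$/;

// Turn the plain-text stack into a list of frame objects.
function parseStack(stack) {
  var frames = [];
  String(stack || '').split('\n').forEach(function (text) {
    var match = CHROME_FRAME.exec(text) || FIREFOX_FRAME.exec(text);
    if (!match) {
      return; // header line or a format this sketch does not recognize
    }
    frames.push({
      identifier: match[1] || null, // null for anonymous frames
      script: match[2],
      line: Number(match[3]),
      column: Number(match[4])
    });
  });
  return frames;
}
```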

Security restrictions

If you have ever run into an error whose message is 'Script error.', you will know what I am talking about: it is a browser restriction on script files from a different origin. The reason for the restriction is as follows. Suppose the HTML that an online bank returns to a logged-in user differs from the HTML an anonymous user sees. A third-party website could then put the URI of the bank's home page into a script.src attribute. HTML of course cannot be parsed as JS, so the browser throws an exception, and the third-party website could work out whether the user is logged in by analyzing where the exception occurred. For this reason, browsers filter every exception thrown by a script file from a different origin, leaving only the constant message 'Script error.'; all other attributes disappear.

For websites of a certain scale, it is normal to serve script files from a CDN on a different origin. Even if you are building a small website yourself, common frameworks such as jQuery and Backbone can be referenced directly from a public CDN to speed up downloads. So this security restriction does cause some trouble: the exception information we collect from Chrome and Firefox ends up being nothing but a useless 'Script error.'.

CORS

To get around this restriction, all we need is for the script file and the page itself to share the same origin. But wouldn't serving script files from a server without CDN acceleration slow down users' downloads? One solution is to keep the script files on the CDN, download their content with XMLHttpRequest via CORS, and then create a <script> tag to inject it into the page. Code embedded in the page is of course same-origin.

This sounds simple, but the implementation has many details. Take a simple example:

<script src="http://cdn.com/step1.js"></script>
<script>
  (function step2() {})();
</script>
<script src="http://cdn.com/step3.js"></script>

We all know that if step1, step2, and step3 depend on one another, they must run in exactly this order, otherwise errors may occur. The browser can request the step1 and step3 files in parallel, but it guarantees the order of execution. If we fetch the contents of step1 and step3 ourselves with XMLHttpRequest, we have to guarantee the correct order ourselves. And don't forget step2: once step1 is downloaded in a non-blocking way, step2 could run before step1 has finished, so we must also intervene and make step2 wait for step1 to complete.

If we already have a set of tools that generates the <script> tags for the site's pages, we need to adjust it so that it changes the <script> tags like this:

<script>
  scheduleRemoteScript('http://cdn.com/step1.js');
</script>
<script>
  scheduleInlineScript(function code() {
    (function step2() {})();
  });
</script>
<script>
  scheduleRemoteScript('http://cdn.com/step3.js');
</script>

We need to implement the two functions scheduleRemoteScript and scheduleInlineScript, make sure they are defined before the first <script> tag that references an external script file, and then rewrite the remaining <script> tags into the form above. Note that the step2 function, which used to execute immediately, is now placed inside a larger code function. The code function is never executed; it is just a container that lets the original step2 code be kept without escaping, while preventing it from running immediately.

Next, we need a complete mechanism that ensures the file contents downloaded from the addresses given to scheduleRemoteScript and the code passed directly to scheduleInlineScript are executed one by one in the correct order. I won't give the detailed code here. If you are interested, you can implement it yourself.
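For readers who want a starting point, one possible shape of such a mechanism is sketched below. It is not the author's implementation: createScriptScheduler, fetchWithXhr, and injectScript are hypothetical names, and the download and execution steps are passed in as functions so that the ordering logic stays separate from the browser APIs. Download failures are not handled here; a failed request would stall every script queued after it.

```javascript
// Hypothetical scheduler. `fetchSource(url, done)` downloads a script and
// calls done(sourceText); `execute(entry)` runs one entry's source.
function createScriptScheduler(fetchSource, execute) {
  var queue = []; // entries in the order they were scheduled

  // Run entries from the front of the queue for as long as they are ready.
  function flush() {
    while (queue.length > 0 && queue[0].ready) {
      execute(queue.shift());
    }
  }

  return {
    scheduleRemoteScript: function (url) {
      var entry = { url: url, source: null, ready: false };
      queue.push(entry);
      fetchSource(url, function (source) {
        entry.source = source;
        entry.ready = true;
        flush();
      });
    },
    scheduleInlineScript: function (code) {
      // `code` is only a container: take the text between its outer braces.
      var text = code.toString();
      var body = text.slice(text.indexOf('{') + 1, text.lastIndexOf('}'));
      queue.push({ url: null, source: body, ready: true });
      flush();
    }
  };
}

// Browser-side adapters; they only work where XMLHttpRequest and
// document exist, and the CDN must allow the request via CORS.
function fetchWithXhr(url, done) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.onload = function () { done(xhr.responseText); };
  xhr.send();
}

function injectScript(entry) {
  var script = document.createElement('script');
  if (entry.url) {
    script.setAttribute('data-src', entry.url);
  }
  script.text = entry.source;
  document.head.appendChild(script);
}
```

In a page, the two global functions could then be taken from createScriptScheduler(fetchWithXhr, injectScript). Downloads still run in parallel; only execution is serialized, which mirrors what the browser does for ordinary <script> tags.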

Reverse lookup of line numbers

Fetching content via CORS and injecting the code into the page gets past the security restriction, but it introduces a new problem: line number conflicts. Originally, error.script identified a unique script file, and error.line then identified a unique line within it. Now that all the code is embedded in the page, multiple <script> tags cannot be told apart by error.script, and line numbers inside each <script> tag start counting from 1, so we can no longer use the exception information to locate the source code where the error occurred.

To avoid line number conflicts, we can waste some line numbers so that the line ranges used by the actual code in each <script> tag do not overlap. For example, assuming the actual code in each <script> tag is no more than 1000 lines, the code in the first <script> tag can occupy lines 1-1000, the code in the second can occupy lines 1001-2000 (with 1000 blank lines inserted in front), the code in the third can occupy lines 2001-3000 (with 2000 blank lines inserted in front), and so on. Then we record this information in data-* attributes to make the reverse lookup easy.

<script
  data-src="http://cdn.com/step1.js"
  data-line-start="1"
>
  // code for step 1
</script>
<script data-line-start="1001">
  // '\n' * 1000
  // code for step 2
</script>
<script
  data-src="http://cdn.com/step3.js"
  data-line-start="2001"
>
  // '\n' * 2000
  // code for step 3
</script>

After this treatment, if an error's error.line is 2005, then the actual error.script should be 'http://cdn.com/step3.js' and the actual error.line should be 5. This reverse lookup of line numbers can be done inside the reportError function mentioned earlier.

Of course, we cannot guarantee that every script file has at most 1000 lines, and some script files may be far shorter than 1000 lines, so there is no need to allocate a fixed 1000-line range to each <script> tag. We can allocate ranges according to the actual number of lines in each script, as long as the ranges used by the <script> tags do not overlap.
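Both halves of this idea, allocating non-overlapping ranges and mapping a reported line back to its script, can be sketched as follows. The function names and the shape of the range table are hypothetical; the table mirrors the data-src and data-line-start attributes shown above and is assumed to be sorted by starting line.

```javascript
// Allocate a starting line for each script from its actual line count,
// so that the ranges never overlap. A script that starts at line N
// needs N - 1 blank lines inserted in front of its code.
function allocateRanges(scripts) {
  var nextLine = 1;
  return scripts.map(function (script) {
    var range = { src: script.src, lineStart: nextLine };
    nextLine += script.lineCount;
    return range;
  });
}

// Map a line number reported for the page back to the script it came
// from and the line number inside that script.
function resolveLocation(ranges, line) {
  var found = null;
  for (var i = 0; i < ranges.length; i++) {
    if (ranges[i].lineStart > line) {
      break; // ranges are sorted, so no later range can contain the line
    }
    found = ranges[i];
  }
  if (!found) {
    return null;
  }
  return { script: found.src, line: line - found.lineStart + 1 };
}
```

An inline script has no address, so its src is null here; how such entries are labeled in reports is left open.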

crossorigin attribute

The browser's security restrictions on content from different origins are of course not limited to <script> tags. Since XMLHttpRequest can get past the restriction through CORS, why can't resources referenced directly by tags do the same? Of course they can.

The restriction on <script> tags that reference script files from a different origin applies equally to <img> tags that reference image files from a different origin. If an <img> tag's image comes from a different origin, then once it is drawn onto a <canvas>, that <canvas> becomes write-only, which ensures that a website cannot use JavaScript to steal unauthorized image data from a different origin. Later, the <img> tag solved this problem by introducing the crossorigin attribute. With crossorigin="anonymous", the request is equivalent to anonymous CORS; with crossorigin="use-credentials", it is equivalent to CORS with credentials.

If <img> tags can do this, why not <script> tags? So browser vendors added the same crossorigin attribute to the <script> tag to address the security restriction above. Chrome and Firefox now support this attribute without any problem. Safari treats crossorigin="anonymous" as crossorigin="use-credentials", so if the server supports only anonymous CORS, the request fails in Safari. Because CDN servers are designed, for performance reasons, to return only static content, they cannot dynamically return the HTTP headers that credentialed CORS requires based on each request. In effect, Safari cannot use this feature to solve the problem above.
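For scripts that are added from JavaScript rather than written into the HTML, the same attribute can be set through the element's crossOrigin property. The helper below is a hypothetical sketch; the document object is passed in explicitly only so that the function does not depend on a global. Keep in mind that the CDN must answer with an Access-Control-Allow-Origin header that permits the page's origin, otherwise the script will not load at all.

```javascript
// Hypothetical helper. `doc` is the page's document object.
// Typical call in a page:
//   loadCrossOriginScript(document, 'http://cdn.com/step1.js');
function loadCrossOriginScript(doc, url) {
  var script = doc.createElement('script');
  // Same effect as <script crossorigin="anonymous" src="...">.
  // Set it before src so that it applies to the request.
  script.crossOrigin = 'anonymous';
  script.src = url;
  doc.head.appendChild(script);
  return script;
}
```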

Summary

JavaScript exception handling looks very simple, no different from other languages, but catching every exception and then analyzing its attributes is actually not that easy. Although some third-party services now offer Google Analytics-style capturing of JavaScript exceptions, if you want to understand the details and principles, you still have to do it yourself.