I'm not sure what's going on, but ever since I updated my M1 Mac mini server (from macOS Ventura 13.5 to 13.6), the web server has gone unresponsive a few times. This is running FMS 19.6.3.
Symptoms:
FileMaker Server is fine; you can connect to databases with FileMaker Pro.
Apache / httpd web server is unresponsive. You cannot connect to / or to /fmi/webd or any other URL.
You can ping the server.
There are no crash reports in Console.app.
My hunch is that apache/httpd was upgraded in 13.6, and there's some bug where it stops responding.
If it happens again I'll do more diagnostic work and report back.
Thanks @Malcolm and @Kirk for suggestions. It's been running for 11 days now, and the only thing I've done so far is... nothing. I'm waiting for it to fail again and will report back when it does.
Still running strong for over 14 days now. I changed nothing on the FileMaker side. Why is it working better now? I have several ideas:
I increased the server monitoring frequency from once every 60 minutes to once every 15 minutes. This is a simple cron job that runs a curl script looking for an HTTP/1.1 200 reply. Could this increased access be doing something good, such as preventing the proxy, web server, or WebDirect process from going to sleep? Possibly. However, the monitor only runs from 0700 through 2300 hours, so it's idle for 8 hours every night, which makes this theory less likely.
I updated the firmware on my router/firewall. What if there is some "packet of death" type attack that could cause the proxy to die, and the firewall is now blocking this attack?
Perhaps there was some sort of attack, which has now stopped, and the timing is a coincidence?
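For anyone wanting to replicate the monitoring check from the first idea above without curl, here is a minimal Python sketch. It is not the actual cron script: the function names are hypothetical, and it assumes that only a 200 response counts as "up" and that any network error counts as "down".

```python
import http.client
import urllib.error
import urllib.request


def status_is_ok(status_line: str) -> bool:
    """Return True if an HTTP status line such as 'HTTP/1.1 200 OK' reports 200."""
    parts = status_line.split()
    return len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1] == "200"


def check_url(url: str, timeout: float = 10.0) -> bool:
    """Fetch url and return True only on a 200 response.

    Timeouts, refused connections, and HTTP error statuses all count as down.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def uncovered_hours(start_hour: int, end_hour: int) -> int:
    """Hours per day with no checks when the monitor runs from start_hour to end_hour."""
    return 24 - (end_hour - start_hour)
```

Calling `check_url()` with the server's WebDirect address from cron every 15 minutes would mirror the setup described. `uncovered_hours(7, 23)` gives the 8-hour nightly gap mentioned above.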
One of our clients sent out a mass email containing a link to the WebDirect website about 10 minutes before the crash.
The first evidence of a problem in the log files:
[Thu Nov 09 11:11:47.837883 2023] [proxy_http:error] [pid 53934] (20014)Internal error (specific information not available): [client A.B.C.D:3204] AH01102: error reading status line from remote server 127.0.0.1:16021
So, something running on port 16021 on localhost appears to have failed. What process is it?
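One way to approach that question is to pull the backend address out of the AH01102 line and then ask the OS which process is listening on that port. The sketch below is only an illustration: the function names are made up, the regular expression assumes the line ends the way the one above does, and the lsof command is built but not run (it may need sudo to see processes owned by another user).

```python
import re

# Matches the backend address at the end of an Apache mod_proxy AH01102 line.
BACKEND_RE = re.compile(
    r"AH01102: .* from remote server (?P<host>[\d.]+):(?P<port>\d+)"
)


def backend_from_error(line: str):
    """Return (host, port) of the backend named in an AH01102 line, or None."""
    m = BACKEND_RE.search(line)
    if m is None:
        return None
    return m.group("host"), int(m.group("port"))


def lsof_command(port: int) -> list:
    """Build (but do not run) an lsof command for the process listening on port."""
    return ["lsof", "-nP", f"-iTCP:{port}", "-sTCP:LISTEN"]
```

For the log line above, `backend_from_error()` returns `("127.0.0.1", 16021)`, and `lsof_command(16021)` is the command to run in Terminal.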
Fishing around for Java logs, I see this: /Library/FileMaker Server/publishing-engine/jwpc-tomcat/logs/catalina.2023-11-09.log
which appears to show the fault:
09-Nov-2023 11:11:45.892 WARNING [http-nio-127.0.0.1-16021-exec-8] org.apache.catalina.connector.Request.startAsync Unable to start async because the following classes in the processing chain do not support async [com.filemaker.jwpc.filter.JWPCFilter,com.filemaker.jwpc.filter.PushRequestValidator]
java.lang.IllegalStateException: A filter or servlet of the current chain does not support asynchronous operations.
at org.apache.catalina.connector.Request.startAsync(Request.java:1719)
at org.apache.catalina.connector.RequestFacade.startAsync(RequestFacade.java:1050)
at javax.servlet.ServletRequestWrapper.startAsync(ServletRequestWrapper.java:402)
at javax.servlet.ServletRequestWrapper.startAsync(ServletRequestWrapper.java:402)
at org.atmosphere.cpr.AtmosphereRequestImpl.startAsync(AtmosphereRequestImpl.java:633)
at org.atmosphere.container.Servlet30CometSupport.suspend(Servlet30CometSupport.java:94)
at org.atmosphere.container.Servlet30CometSupport.service(Servlet30CometSupport.java:69)
at org.atmosphere.cpr.AtmosphereFramework.doCometSupport(AtmosphereFramework.java:2297)
at com.vaadin.server.communication.PushRequestHandler.handleRequest(PushRequestHandler.java:234)
at com.vaadin.server.VaadinService.handleRequest(VaadinService.java:1609)
at com.vaadin.server.VaadinServlet.service(VaadinServlet.java:448)
at com.filemaker.jwpc.iwp.application.AppServlet.service(Unknown Source)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at com.filemaker.jwpc.filter.JWPCFilter.doFilter(Unknown Source)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at com.filemaker.jwpc.filter.PushRequestValidator.doFilter(Unknown Source)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
at com.filemaker.tomcat.FMErrorReportValve.invoke(Unknown Source)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:357)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:893)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1707)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.base/java.lang.Thread.run(Unknown Source)
Claris stopped bundling Java. You would have seen the notices to install Java when you were installing FMS. I don’t have the correct version details on hand but they should be on the help/install pages.
Each FMS version is compatible with a specific Java version. You should check which Java version FMS 19.6.3 is compatible with and install only that version. You should also check whether more than one version is installed.
FileMaker Server 19.2.1 through FileMaker Server 19.6.3 --> Claris International Inc. recommends the latest version of Java 11, for either OpenJDK or Oracle
Very interesting: it happened again (the proxy died). Also, while doing some tests, I noticed that an idle WebDirect page was getting the "1X JSON error":
Communication problem
Take note of any unsaved data, and click here to continue. Invalid JSON from server: 1X
More info: I have two nearly identical servers, and am having the issue only on one of them.
Both are 2020 Mac Minis with the M1 chip
both have FMS 19.6.3.302
the "good" server is on macOS Ventura 13.5
the "bad" server is on macOS Ventura 13.6
the "good" server is running Java version Temurin-11.0.16.1 x86_64
the "bad" server is running Java version Temurin-11.0.17+8 arm64
My recollection is that the problems started about the time I upgraded the "bad" server from 13.5 to 13.6. However, the fact that the two servers are on different Java versions, and the bad server is running the Apple Silicon version of Java could be important.
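Laying the two configurations side by side makes it obvious that two variables differ, not one, so the macOS upgrade can't be blamed on its own. A small sketch (the dictionary keys and the helper name are just for illustration; the values are the ones listed above):

```python
def differing_keys(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(set(a) | set(b))
        if a.get(k) != b.get(k)
    }


good = {
    "hardware": "2020 Mac mini, M1",
    "fms": "19.6.3.302",
    "macos": "13.5",
    "java": "Temurin-11.0.16.1 x86_64",
}
bad = {
    "hardware": "2020 Mac mini, M1",
    "fms": "19.6.3.302",
    "macos": "13.6",
    "java": "Temurin-11.0.17+8 arm64",
}
```

`differing_keys(good, bad)` reports only `java` and `macos`, which are the two candidates to test separately.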
It turns out the client was sending mass emails to several thousand customers with a link to the WebDirect site. I didn't know they were doing this, but once they told me, the pattern became clear. Each email blast is followed by the proxy server crashing within a few minutes. Logs show something like 1500 hits to the /fmi/webd/ URL in under a minute. We've repeated this several times now and it's reproducible.
It's not actually humans clicking the URL in the email to the WebDirect site, but rather some sort of email malware scanners accessing the URL.
I don't know whether it's the volume of hits that is crashing the proxy or some other issue (such as a malformed HTTP request, bad headers, etc.) but in any case I'm working with the client to say "don't do that" (don't include WebDirect URLs in the emails).
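To spot a burst like the one described above, requests to /fmi/webd can be counted per minute from the web server's access log. This is a rough sketch that assumes the log uses Apache's Common Log Format; I haven't confirmed the exact format FMS configures, and the function name is hypothetical.

```python
import re
from collections import Counter

# Common Log Format: host ident user [dd/Mon/yyyy:HH:MM:SS zone] "METHOD path proto" ...
# The "minute" group keeps the timestamp up to HH:MM and drops the seconds.
LINE_RE = re.compile(
    r'\[(?P<minute>[^\]]+?):\d{2} [^\]]+\] "(?:\S+) (?P<path>\S+)'
)


def hits_per_minute(lines, prefix="/fmi/webd"):
    """Count requests whose path starts with prefix, grouped by minute."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("path").startswith(prefix):
            counts[m.group("minute")] += 1
    return counts
```

Feeding it the lines of the access log and calling `.most_common(5)` on the result would show the busiest minutes, which can then be compared against the time of each email blast.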
One bit of good news: I discovered that I no longer have to reboot the entire server. Instead, I can issue these commands:
fmsadmin restart SERVER
fmsadmin restart WPE
and WebDirect comes back up. Still annoying, as we have to kick any FMPro users offline, but quicker and less disruptive than a full server reboot.
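If you want to script the two restarts, something like the following could work. Treat it as a sketch under assumptions: it runs exactly the two commands above in order, stops if one fails, and does not handle any credential or confirmation prompts that fmsadmin may show. The `run` parameter is a hypothetical hook so the sequence can be dry-run without touching a server.

```python
import subprocess

# The two commands from the post, in the order they were issued.
RESTART_COMMANDS = [
    ["fmsadmin", "restart", "SERVER"],
    ["fmsadmin", "restart", "WPE"],
]


def restart_webdirect(run=subprocess.run):
    """Run each restart command in order, stopping at the first failure.

    `run` must accept a command list and return an object with a
    `returncode` attribute (subprocess.run does). Returns the list of
    commands that were attempted.
    """
    attempted = []
    for cmd in RESTART_COMMANDS:
        attempted.append(cmd)
        result = run(cmd)
        if result.returncode != 0:
            break  # do not restart WPE if the SERVER restart failed
    return attempted
```

Pairing this with a monitoring check would give a crude watchdog, but since the SERVER restart disconnects FMPro users, I'd keep a human in the loop.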