News archive: Emergency Maintenance
The former Trident Networks/Speedyweb server "Neptune" has been experiencing issues lately. In order to ensure future stability and performance of the sites served from this machine we've decided to migrate them to more reliable servers. Accounts will be moved to newer and faster servers, either running UNIX or Windows, depending on whether the website relies on FrontPage extensions. E-mail and any MySQL databases will be migrated to UNIX servers.
We apologize for any inconvenience that this migration may cause you and will be more than happy to answer any questions you may have; we will be working with you to resolve any issues that may arise due to this migration.
Users on Neptune will be contacted directly via our helpdesk ticketing system with more details. Please respond to the helpdesk ticket and/or call us at 877-720-0483 option #3. We will have staff onsite 24 hours a day, 7 days a week during this migration and they will be able to help you with any problems that you may have. If you believe that your e-mail address with us may be out of date we highly recommend that you respond to this ticket or call us at 877-720-0483 option #2 during business hours and update your contact information with an Account Manager.
Thank you for your patience.
posted by Chuck G. at 05:17 PM on Tuesday, May 6, 2008
Categories: Emergency Maintenance, Hosting Servers, Mail, MySQL hosting
Just a reminder, tonight at 11 PM we will be performing an emergency maintenance on part of our electrical system. This will have an impact on a portion of our shared web and database servers, and eight dedicated and colocated servers. The shared hosting servers affected are:
acacia, acorn, alder, aralia, arrowwood, ash, aspen, avocado, balsa, balsam, bamboo, banana, banyan, bayberry, beech, bigleaf, blackberry, boojum, boxwood, bubinga, buckeye, cactus, cedar, cherry, chestnut, cholla, cinnamon, clover, columbia, commerce2, commerce3, cork, cottonwood, db, deerwood, dogwood, ebony, elm, evergreen, ficus, fig, filbert, fuji, gingko, grape, grapevine, hackberry, hazel, hemlock, hickory, ironbark, ivy, kentia, kola, kudzu, larch, laurel, lilac, lime, madrona, magnolia, mango, mangrove, maple, mesquite, mimosa, moringa, mulberry, myrtle, newninewire, ninewire2, olive, orchid, palmetto, papaya, pear, pecan, plum, poplar, privet, quince, redbud, sassafras, savin, sequoia, sherwood, snowberry, spiceberry, spruce, strawberry, sycamore, tamarack, teak, truffula, tutsan, walnut, woodpecker, yucca, yulan/fern
Every effort will be made to minimize the downtime. Our electricians have estimated the time required to complete the task at 2 hours.
Thank you for your patience and understanding while we strive to build a better facility.
Update 1:15 AM: The maintenance went very smoothly and was completed in under 90 minutes. We were able to supply alternate power to the few dedicated and colo servers involved in the maintenance. We also took the opportunity to replace some failed parts in a couple of servers (acorn, for example, whose power supply fan had stopped working).
Again, thank you so much for your patience during this critical maintenance interval.
posted by Chuck G. at 07:30 PM on Tuesday, March 11, 2008
Categories: Emergency Maintenance, Hosting Servers, alder.forest.net, arrowwood.forest.net, aspen.forest.net, balsa.forest.net, bamboo.forest.net, banana.forest.net, bayberry.forest.net, bigleaf.forest.net, boysenberry.forest.net, cactus.forest.net, cedar.forest.net, chestnut.forest.net, cinnamon.forest.net, columbia.forest.net, elm.forest.net, evergreen.forest.net, fuji.forest.net, hazel.forest.net, kentia.forest.net, kola.forest.net, laurel.forest.net, lime.forest.net, olive.forest.net, orchid.forest.net, palmetto.forest.net, pear.forest.net, quince.forest.net, sassafras.forest.net, sherwood.forest.net, spruce.forest.net, sumac.forest.net, sycamore.forest.net, tamarack.forest.net, tutsan.forest.net, www.ninewire.com
On Tuesday night, March 11th, into Wednesday morning, March 12th, we will be performing emergency maintenance on our power infrastructure related to the installation of our new UPS system. This maintenance will impact a small portion of our shared hosting clients. We will provide specific times in the next couple of days as they firm up.
We will strive to keep this outage as short as possible but we are allowing up to two hours of downtime.
Thanks for your understanding while we grow to serve you better.
posted by at 05:25 PM on Friday, March 7, 2008
Categories: Emergency Maintenance, Hosting Servers
The gold.wwwnexus.com hosting server has experienced a hardware failure. We are working on repairing it, and will update as soon as we have more information.
posted by Bill D. at 01:05 PM on Friday, March 7, 2008
Categories: Emergency Maintenance
One of our mail servers, smtp.forest.net, is down to investigate recurring problems. We hope to have the server back up within an hour to ninety minutes. Thanks for your patience.
Update (4:38AM PST): smtp.forest.net is back up and running.
posted by digital.forest at 03:43 AM on Tuesday, January 15, 2008
Categories: Emergency Maintenance, Mail, smtp.forest.net
At 2am tomorrow morning we will be shutting down the mail server "treehouse" to install new memory in it. We have been seeing RAM-related errors which caused the server some problems last week. We figured tonight would be a good time to bring it down and perform the hardware installation. Downtime should be limited to 15 minutes or less.
posted by Chuck G. at 07:44 PM on Monday, December 31, 2007
Categories: Emergency Maintenance, Mail, treehouse.forest.net
One of our mail servers, treehouse.forest.net, is experiencing problems and is currently down. We are working to restore it as soon as possible.
Thanks for your patience.
Update 11:25AM: treehouse is back up and running.
Final Report 12:30PM: Treehouse was experiencing memory-related errors and when we rebooted it the hardware self-test showed a failed RAM card. This particular server uses RAM in pairs so we had to source a pair of equivalent cards from our inventory to get the mail server running again. We have ordered additional RAM for both treehouse and our inventory and will likely schedule a brief downtime for treehouse over the holidays to perform this work.
posted by digital.forest at 11:22 AM on Wednesday, December 19, 2007
Categories: Emergency Maintenance, Mail, treehouse.forest.net
Starting at around 2AM we started seeing temps rise in the datacenter, more so in DC 1. We've called our HVAC-systems vendor and they should have an emergency technician out here shortly. We've taken steps to mitigate temps in the meantime. We'll post updates as more information becomes available.
UPDATE: 03:55 Our HVAC vendor arrived within a few minutes of the posting above. They've corrected the issue and we are recovering nicely. Things should be back to normal in a few minutes. We'll post an update when we have definitive data on the cause.
posted by Chuck G. at 03:41 AM on Wednesday, December 5, 2007
Categories: Emergency Maintenance
One of our shared hosting servers, souari, is having trouble and is currently down. We apologize for the inconvenience.
Update 10:30AM: souari is back up.
posted by digital.forest at 10:29 AM on Tuesday, December 4, 2007
Categories: Emergency Maintenance
One of our legacy Trident servers, celestial.wwwnexus.com, is currently experiencing technical difficulties and is not serving pages.
We apologize for the inconvenience this presents, and are currently working to bring it back to full operation. We'll edit the support blog when there's progress to report.
Update 9:20AM: celestial is now back online.
posted by digital.forest at 07:29 AM on Monday, December 3, 2007
Categories: Emergency Maintenance
This morning we discovered a hardware failure with callisto.forest.net.
Thankfully all data has been recovered and we have already found replacement hardware. The new hardware is currently running preliminary self-tests; callisto should be back as soon as these are complete. We will update this blog when callisto has fully recovered. Sorry for any inconvenience, and thanks for your patience.
Update 11/21/2007 09:49 PST: The callisto server has returned to operational status and all services of that server should be working properly again.
posted by digital.forest at 09:21 AM on Wednesday, November 21, 2007
Categories: Emergency Maintenance
We will be taking the FileMaker 9 Server named rosemary offline this morning to perform some emergency maintenance on the FileMaker Instant Web Publishing. We do not currently have an ETA for when this work will be completed but we will update this entry as we have more information.
Thank You
11/07/2007 10:25AM: Rosemary is now back online and the error in IWP has been corrected.
posted by digital.forest at 01:34 AM on Wednesday, November 7, 2007
Categories: Emergency Maintenance, Hosting Servers
Last night around 19:30, one of the three primary compressors of one of our two HVAC systems failed. This was noted by NOC personnel, who contacted our HVAC contractor, who set the controls to bypass the failed unit. We did run on single-stage cooling from one unit for several hours, which led to both datacenters reaching temperatures in the low-80s F/high-20s C. Once the unit was bypassed 3-stage cooling was achieved and temps dropped to their normal mid-60s F/high-teens C. With outside summertime temperatures in the high-80s F/low-30s C possible we require 4-stage cooling at peak hours.
We will be replacing the failed compressor tonight between 21:00 and 23:00. We have secured portable HVAC units to supplement our secondary HVAC while the primary unit is down for repairs. We will update this website with more information as the repair progresses.
Update: 8:00 AM The new compressor was installed last night. We've resumed normal operations.
Regards,
--Chuck Goolsbee
VP, Tech Ops
digital.forest
posted by Chuck G. at 03:38 PM on Thursday, July 5, 2007
Categories: Emergency Maintenance, Facility Maintenance
We are currently performing emergency maintenance on our Souari server (216.168.37.73). We currently have a team of technicians working on the problem to resolve it as quickly as possible and will update you as soon as we have more information. Unfortunately we do not currently have an ETA for when the server will be back up but we will let you know as soon as it is.
Please accept our sincere apologies for this service outage and please let us know if you have any questions or concerns.
Thank You
Update (10:50 PDT): Apache on souari.forest.net is online and all websites on the server are accessible at this time. We are still working to resolve an issue with the sendmail functionality on the server and our technicians hope to have that service online soon. Unfortunately we do not have an ETA for the repairs to the sendmail service at this time.
Update (11:44 PDT): Currently FTP on the souari.forest.net server is not available for the same reason that the sendmail service is not functioning on the server. We currently have technicians working to resolve these issues and hope to have an ETA for you soon.
Update (12:32 PDT): To correct a problem with bad library files on Souari, there will be another brief downtime, beginning immediately. We will update again when the server is back up and running.
Update (13:52 PDT): While correcting the problem with the bad libraries on souari.forest.net we discovered an additional problem and have been working to resolve it. This has resulted in additional emergency maintenance downtime. Please know that we are doing everything that we can to resolve these issues as quickly as possible and have assigned our entire technical department to work on this issue.
Update (17:31 PDT): The Emergency Maintenance on the souari.forest.net server has been concluded and the server has returned to normal operating status.
posted by digital.forest at 09:44 AM on Monday, July 2, 2007
Categories: Emergency Maintenance
The shrubbery.forest.net server has been taken offline in order to perform some emergency maintenance. We have technicians working on the server and will have it back online as soon as possible.
Please accept our apologies for this service outage and please let us know if you have any questions or concerns.
Digital Forest Technical Support
877-720-0483 option #3
Update (10:55 PDT): We have concluded our emergency maintenance on Shrubbery.forest.net and it is now online and functioning normally.
posted by digital.forest at 09:40 AM on Monday, July 2, 2007
Categories: Emergency Maintenance
One of the distribution switches in our facility, specifically in the rack-colocation area in rows 11 & 12 in DC1, has been showing errors and causing some network issues for servers in that area today. Our switch vendor believes that this is being caused by a bad gigabit port which uplinks that switch to our network core. The other possibility is a bad switch engine. Thankfully the former of these has some built-in redundancy. The latter is an easy card swap, and we have spares. So in a few moments we will be manually failing the gigabit uplink to the redundant port. It is unlikely that this will be any more noticeable than the intermittent issues that servers have seen on this network segment, namely some dropped packets and retransmissions. If that does not improve things, we'll replace the switch engine blade later tonight during a maintenance window.
Update 5:00 PM: The card reset seems to have solved the issue. We'll keep an eye on things over the next few days to be sure. We'll also order a replacement card for the switch. Thanks for your patience.

Above: Network Manager Kyle Murray pulls the problem gigabit card from the switch.
posted by Chuck G. at 08:28 AM on Monday, June 11, 2007
Categories: Emergency Maintenance, Network
The source of our HVAC system issue today was a malfunction in our fire suppression system. The fire system is one that is designed to suppress fires without damaging computer equipment. It works by sealing the datacenter and flooding it with a gas which smothers the fire. The first stage in the process, "sealing the room", is done by shutting off the air conditioning using motorized dampers in the ducting.
For reasons we do not know yet, these dampers closed due to some malfunction.
We were able to open them manually and then bypass the fire suppression system's control over the HVAC system. By this time, however, temperatures in the datacenter were out of the normal range. We elected to shut down as many systems as possible to accelerate the cooling and minimize the heat load.
We will have our Fire Suppression system vendor out to diagnose and correct this malfunction as soon as possible.
The timing of this event coincided with what is predicted to be the last of a streak of relatively hot days here in the Seattle area. We doubt this is linked to the cause, but it certainly contributed to the overall problem and the time taken to recover.
We will continue to update the support blog post below this one with breaking news as it happens. Thank you for your patience during this event.
posted by Chuck G. at 07:41 AM on Sunday, June 3, 2007
Categories: Emergency Maintenance
We are experiencing an HVAC-related issue right now. Our HVAC maintenance vendor is en-route. We may be shutting down systems to lower the datacenter temperature. Please stay tuned for more information.
Update: 12:35 PM Vendor is on site, the system is running again, but datacenter temperature is high. We are shutting down as many systems as possible to reduce the heat load on the facility in order to get the temperature back down as swiftly as possible.
Update: 12:45 PM It appears the HVAC was shut down by a malfunction in our fire suppression system. We have bypassed that system's HVAC controls for the time being to prevent it from reoccurring.
Update: 12:50 PM Temperatures are beginning to come back down.
Update: 1:30 PM We have additional staff on-site now to handle phone calls and server shut-downs/startups.
Update: 2:50 PM We will start bringing shared servers online very soon.
Update: 4:30 PM Datacenter temperature has stabilized and is back to normal. We have restarted most servers, but some still remain down due to miscellaneous problems. If any of these appear to have longer-term issues, we will begin to post individual lists and updates separate from this entry.
Thanks for your patience and understanding.
posted by Chuck G. at 05:27 AM on Sunday, June 3, 2007
Categories: Emergency Maintenance
At 9:50 AM this morning one of our Metropolitan Ethernet providers, OnFiber, had an equipment failure here in Seattle. We connect to one of our network peers, NTT/America, at The Westin Building via this circuit. This caused us to have intermittent connectivity over that particular circuit to NTT/America. Some digital.forest clients may have had "slow" or "intermittent" issues reaching servers here for a short period of time while we diagnosed the issue with the NOCs of NTT & OnFiber.
We have shut down our BGP connection to NTT/America while OnFiber fixes the problems on their network. At the moment we are running on two of our three network connections. We will update this post when we bring the third circuit back online.
Update: As of 11:02 AM PDT this issue is completely resolved. The OnFiber circuit was manually moved to a different port. After a successful 10-minute testing of the new circuit we turned up our BGP session with NTT/America.
We maintained connectivity to our other BGP network peers through this event, so at no time was our network "down". We do like to keep our clients informed of events here at our datacenter, even if they have no direct impact on your servers. In this case, it was a classic example of Internet Architecture and how it handles outages. The often-used phrase is that it "routes around damage." In this instance when one of our circuits had an issue our traffic just shifted to our other circuits. It is likely that none of our clients even noticed. If they did notice it would have been an intermittent connectivity for a brief period of time. Such is the nature and reason for designing redundant systems. Our fiber optic connectivity to the rest of the Internet flows over multiple physical paths. Those paths do not converge until they are physically inside our datacenter facility. This prevents complete outages through equipment failure or accidental fiber cut. Today's event confirms the built-in redundancies work as designed.
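The "routes around damage" behavior described above can be illustrated with a toy model. This is only a sketch: real BGP path selection is far more involved, and apart from the NTT/America circuit the peer names below are hypothetical placeholders, since the post does not name the other two connections.

```python
def usable_paths(circuits):
    """Return the names of circuits that are currently up."""
    return [name for name, is_up in circuits.items() if is_up]


def network_is_up(circuits):
    """The network is reachable as long as at least one circuit is up."""
    return len(usable_paths(circuits)) > 0


# Three upstream connections, as mentioned in the post.
# Only the first name comes from the post; the others are hypothetical.
normal = {
    "NTT/America via OnFiber": True,
    "peer-2 (hypothetical)": True,
    "peer-3 (hypothetical)": True,
}

# During the event the BGP session to NTT/America was shut down.
during_outage = {**normal, "NTT/America via OnFiber": False}

print(usable_paths(during_outage))
print(network_is_up(during_outage))
```

With one circuit withdrawn, traffic still has two paths available, which matches the statement that the network was never "down" during the event.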
--Chuck Goolsbee
VP Technical Operations
digital.forest, Inc.
posted by Chuck G. at 11:00 AM on Tuesday, May 8, 2007
Categories: Emergency Maintenance, Miscellaneous, Network
Tonight at 8:00 pm we will be performing a software update on our helpdesk system. This will involve approximately 5 minutes of downtime. At that time our online trouble ticket system will be unavailable.
All other support options (telephone, emergency pager, etc) will remain online throughout the maintenance window. The trouble ticket system should be back online by 8:05 PM.
We apologize for any inconvenience this may cause.
posted by Chuck G. at 02:59 PM on Tuesday, May 1, 2007
Categories: Emergency Maintenance
4. Wrap up & Summary
We're pleased to report that the repair on our HVAC system is complete, and finished without incident. The final bit of work required brazing & welding within the unit itself. To mitigate any risk of the pre-action fire suppression system discharging its gases, we had our vendor, Fire Chief, come out and disable the system. Part of our annual maintenance procedure for the fire suppression system involves the shutdown of the HVAC system anyway, so Fire Chief took advantage of the situation to perform that maintenance.

Above: Technicians from Fire Chief perform preventative maintenance on the Fire Detection and Suppression system.
During the HVAC system shutdown, digital.forest staff monitored temperatures in various locations around the datacenter, while our Facilities Manager bounced between the roof and the datacenter monitoring our vendors. Below you can see digital.forest Tech Support member Will Winslow and Facilities Manager Kevin Teker in the darkened datacenter just after the HVAC shutdown occurred. They're carrying their temperature monitors and about to spread out to their stations. You can see the high-CFM fans mentioned earlier today in the open door behind them.

All of our preparation paid off, plus a bit of luck from the weather (it stayed very cool, plus it didn't rain) so that the natural tendency for the facility to warm up was mitigated by the combination of pre-cooling and the fans pulling outside air into the facility. We're happy to report that our highest temperature reached was about what we see here on a "normal" day. Our temporary portable HVAC units never even needed to be turned on.

Interesting conclusions:
Electrical capacity is a hot topic in the datacenter management business these days. There are various rules of thumb concerning the estimation of power usage split between "floor" (meaning the servers) and "mechanical" (meaning the HVAC systems to cool the servers). The variable is the delta between outside and inside ambient temperatures. The hotter it is outside, the harder the HVAC systems have to work to chill the inside. We're blessed to be located in a very moderate climate here in Seattle. It rarely gets very hot here. Nor does it get very cold. Our average temperature is actually quite a bit lower than ideal datacenter temperature. Even in summer, it cools enough at night to keep our average right at ideal datacenter temperature. We monitor electricity usage at several points along the flow for a lot of reasons, but on our main panel in the datacenter we can check at a glance and see how much power is being used in total. The ammeter, for example, read this way earlier today when we were running the rooftop HVAC on 100% outside air:

That reads 274 Amps. That is 274 Amps of 3-phase power as it comes in off the grid. Our feed is 2000 Amps so as you can see we have a lot of room for growth with regards to electricity. This is one of the things that really attracted us to this facility when we moved here just over two years ago. With so many datacenter operations running at nearly 100% of their power capacity we felt it important to be able to accommodate our clients' expanding needs and requirements. This maintenance interval provided us some real-time data concerning the power needs of our mechanical infrastructure. Those rules of thumb mentioned earlier say "for every 1 amp you feed the floor, you feed the mechanical 1 to 1.75 amps." This seems to have been proven in our experience, but rounded down due to our temperate, if not downright cool, location here in Seattle. Here is a shot of the ammeter with the HVAC system shut down completely:

That is 219 Amps of 3-phase power. Looking at our monitoring history, we hit our maximum of 400 Amps last July when we had a week of temperatures in the 90-95° F (32-35°C) range. That means we are running at a roughly 1:1 floor:mechanical ratio in terms of electricity at our peak consumption. If anything we are favoring the floor, which is a great advantage in this industry.
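The arithmetic behind that ratio can be worked through in a few lines. This is a rough sketch using only the ammeter readings quoted in this post, and it assumes the floor (server) load at last July's peak was about the same as the 219 Amps measured today with the HVAC off; the function names are purely illustrative.

```python
FEED_AMPS = 2000         # service feed, from the post
FLOOR_AMPS = 219         # ammeter reading with the HVAC fully shut down
OUTSIDE_AIR_AMPS = 274   # ammeter reading with HVAC on 100% outside air
PEAK_AMPS = 400          # historical maximum (last July)


def mechanical_amps(total, floor=FLOOR_AMPS):
    """Amps attributable to cooling: total draw minus the floor load."""
    return total - floor


def mechanical_per_floor_amp(total, floor=FLOOR_AMPS):
    """Mechanical amps consumed for every 1 amp fed to the floor."""
    return mechanical_amps(total, floor) / floor


def feed_utilization(total, feed=FEED_AMPS):
    """Fraction of the incoming feed in use."""
    return total / feed


for label, total in [("outside air", OUTSIDE_AIR_AMPS), ("peak", PEAK_AMPS)]:
    print(
        f"{label}: mechanical={mechanical_amps(total)} A, "
        f"ratio=1:{mechanical_per_floor_amp(total):.2f}, "
        f"feed used={feed_utilization(total):.1%}"
    )
```

Under that assumption the peak works out to about 181 Amps of mechanical load against 219 Amps of floor load, or roughly 1:0.83, which is below the low end of the 1 to 1.75 rule of thumb and uses about 20% of the 2000 Amp feed.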
Yet another benefit of colocation at digital.forest in cool Seattle!
posted by Chuck G. at 02:03 PM on Wednesday, March 14, 2007
Categories: Emergency Maintenance, Facility Maintenance
3. Shut Down Interval.
At 12:07 the entire HVAC system was shut down. Datacenter temps are well within reasonable tolerances after 20 minutes on fans alone. We'll update again with more information after the HVAC is returned to service.
Update: 12:32 PM PDT
HVAC systems are running again. We'll summarize the day's work soon.
posted by Chuck G. at 12:27 PM on Wednesday, March 14, 2007
Categories: Emergency Maintenance, Facility Maintenance
2. Repair Work
Thankfully it has remained nice and cold outside today so our HVAC system, which is designed to use outside cool air if available to reduce compressor load, is running 100% on outside air. This allowed us to continue to run the HVAC while the technicians remove the old compressor and install the new one. So from the perspective of the datacenter things appear no different than a normal day here at digital.forest. All the action is happening up on the roof:


In the top image above the techs wrestle the new compressor up a temporary ramp and into place. In the bottom shot you can see the new compressor in place, and the old broken one on the handtruck, ready to be removed.
The Trane Intellipak is an excellent HVAC system that has a myriad of control options. Below you can catch a glimpse into the heart of the controls, which are usually locked behind a steel panel. We usually interface with these systems via software down in the office, but occasionally it is good to have a look at the atoms represented by the bits.

Above is a close up of the breakers and control units for the compressors. You can see that several breakers are in the "off" position, providing safety for the technicians while they work. Others remain "on" so that the system can still function and provide air handling for the datacenter.

Above: digital.forest Facilities Manager Kevin Teker explains how all of this works.
The next step requires the complete shutdown of the HVAC and Fire Suppression Systems, as the HVAC technicians braze some plumbing. Stay tuned.
posted by Chuck G. at 11:57 AM on Wednesday, March 14, 2007
Categories: Emergency Maintenance, Facility Maintenance
1. Preparations
In our world Electricity is transformed into Bits, with the by-product of BTUs (heat). Our job is to handle (route) bits and manage (cool) BTUs. Despite the fact that we are fairly certain that the technicians can get their work done with minimal downtime of the HVAC system, we are living by the old adage "hope for the best, but prepare for the worst." To that end we have performed the following preparations. We are pretty intimate with our facility and know where the heavy users of electricity are located. We have the "hot spots" identified and covered by portable AC units.


We also have the ability to pull outside air into the facility in large volume, and use smaller local fans to provide ventilation to "warm spots." The outside temperature at the moment is 39°F/4°C, so it is a fairly good day to be performing this task.


This process of course requires a bit of preparation itself. High CFM fans and portable HVAC units are not exactly light users of electricity themselves, and to protect the servers you depend upon we can't just plug them in wherever there's an open outlet. Mechanical motors put variable strains on electrical circuits and it is not smart to put them on the same circuit being used by computers. Therefore we have used building electricity circuits for these devices, rather than the power from our PDUs that feeds the racks. We've taken the extra step to lay extension cords to the various mechanical units, and gaffer-taped those to the floor. Additionally our Facilities Manager has diagramed the circuits and breakers involved in feeding the mechanical units and calculated the amperage loads so we can avoid popping breakers.
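The load calculation described above can be sketched as follows. This is a minimal illustration, not the Facilities Manager's actual worksheet: the circuit names, breaker ratings, and device draws are hypothetical, and the 80% continuous-load margin is an assumed rule of thumb rather than something stated in this post.

```python
CONTINUOUS_LOAD_FACTOR = 0.8  # assumption: keep planned load under 80% of the breaker rating


def overloaded_circuits(circuits, factor=CONTINUOUS_LOAD_FACTOR):
    """Return the names of circuits whose planned load exceeds factor * breaker rating.

    `circuits` maps a circuit name to (breaker_amps, {device_name: amps}).
    """
    flagged = []
    for name, (breaker_amps, loads) in circuits.items():
        if sum(loads.values()) > breaker_amps * factor:
            flagged.append(name)
    return flagged


# Hypothetical plan: every number here is made up for illustration.
plan = {
    "building-A": (20, {"portable AC": 12.0, "local fan": 3.0}),     # 15.0 A planned
    "building-B": (20, {"portable AC": 12.0, "high-CFM fan": 6.5}),  # 18.5 A planned
}

print(overloaded_circuits(plan))
```

In this made-up plan the second circuit would be flagged, so one of its devices would need to move to another building circuit before the maintenance starts.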

We have also deployed some temperature probes in critical locations to monitor the ambient temperatures in "cool rows" to see what the intake air is like for servers. Finally, overnight we dropped the datacenter temperature several degrees below our normal 65°F/18°C to provide some "breathing room".


More info coming soon.
posted by Chuck G. at 09:58 AM on Wednesday, March 14, 2007
Categories: Emergency Maintenance
Last week we had a single compressor unit in our Trane Intellipak cooling system fail during an unseasonably warm day. The system has built-in redundancies to handle such situations so we recovered quickly from the condition. In order to prepare for the warmer weather coming soon, we have elected to replace this failed unit now. So tomorrow (Wednesday, March 14) we will have a vendor here replacing the compressor. This will involve occasional, brief shutdowns of our HVAC system.
We have brought in industrial sized high-CFM fans, to maintain air circulation in the facility during the maintenance. Additionally we have several portable 1-ton HVAC systems which we can deploy on an as-needed basis should any areas of our datacenter exceed standard temperatures. We have deployed temperature probes throughout the datacenter to monitor this as the maintenance progresses. As such we are confident that this event will have minimal-to-no impact on operations, since we will be prepared to mitigate any heat issues should we see temperatures rise.
We apologize for the short notice, and we hope you understand the reasons why. We strive to maintain our facility to the highest standard, as well as keep you informed as we take steps to do so. We will post updates throughout the day tomorrow.
Chuck Goolsbee
VP, Technical Operations
digital.forest, Inc.
posted by Chuck G. at 10:22 PM on Tuesday, March 13, 2007
Categories: Emergency Maintenance, Facility Maintenance
The server www.ninewire.com is currently experiencing hardware problems. We are working on the server and will have the issues resolved as soon as possible.
posted by digital.forest at 08:58 PM on Monday, November 13, 2006
Categories: Emergency Maintenance
Sorry for the wait everyone.
Celestial is back up and running. All webpages should be working again.
posted by digital.forest at 02:04 PM on Friday, November 3, 2006
Categories: Emergency Maintenance
We've been experiencing problems with our "Celestial" server for the past few days, and today it's had to come down for hardware maintenance. Any user accounts hosted on celestial will therefore be unavailable until the maintenance is completed. Our apologies for the inconvenience, we'll try and have it back up as soon as possible.
posted by digital.forest at 09:25 AM on Thursday, November 2, 2006
Categories: Emergency Maintenance
The hosting server Titan is currently down for maintenance. We expect to have all of the sites operational by mid-morning on October 10th.
posted by digital.forest at 12:36 AM on Tuesday, October 10, 2006
Categories: Emergency Maintenance
Most accounts have now been moved to the new hardware and are functioning normally. A few accounts are still experiencing some difficulties and we are working hard to find a resolution to each of their unique problems.
If you find that your site is available but not working properly, please submit a trouble ticket through http://www.forest.net/helpdesk providing the main URL, plus the URL and functionality that is not working properly. We will continue to give these issues top priority until all of the sites on Sabre are working properly again.
posted by digital.forest at 12:38 AM on Friday, June 23, 2006
Categories: Emergency Maintenance
The Windows ColdFusion and IIS hosting server Sabre is going down for emergency maintenance. We will be taking two maintenance windows today to improve the performance of the server. The first window will be at 10:30 AM Pacific and will last for approximately 15 minutes. The second will take place at 2:00 PM Pacific and will last for approximately 15 minutes.
We're working hard to improve the performance of this server and hope to see noticeable differences after the second maintenance window.
posted by digital.forest at 10:22 AM on Tuesday, June 20, 2006
Categories: Emergency Maintenance
Mars is currently down for an emergency maintenance due to a failed hardware component. Unfortunately at this time Mars is still down due to this problem. If you are a former Trident/Speedyweb customer and your website and email are currently not working, you are affected by this problem.
Update @ 3:30am PDT: The proper technicians have been notified and we are currently evaluating the situation in order to get this resolved as soon as possible. We anticipate that Mars will be back online around 9:30 AM PDT.
Update @ 8:30am PDT: We are executing our plan to get the machine fully back up.
posted by digital.forest at 10:50 PM on Wednesday, June 7, 2006
Categories: Emergency Maintenance, Hosting Servers
Update @ 12:30 PM PDT: Neptune is now back online
Currently Neptune.wwwnexus.com is down due to a catastrophic hardware failure. We are currently moving it over to new hardware and expect that Neptune will be fully operational by 2:30 PM PDT.
-digital.forest technical support
posted by digital.forest at 09:40 AM on Wednesday, April 26, 2006
Categories: Emergency Maintenance
Final Update:
Mango is now back online. All files have been recovered, and the server has received a major upgrade over the hardware it was previously running on. If you are hosted on Mango and you are experiencing problems, please notify our technical support immediately. -Yvo
UPDATE:
Mango suffered a catastrophic hardware failure. We are currently finding alternative hardware for the server.
Currently, mango.forest.net is down and is being repaired. Downtime should not be more than 1 hour.
posted by digital.forest at 01:37 PM on Friday, April 7, 2006
Categories: Emergency Maintenance, Hosting Servers
Currently FTP service is disabled on Souari as we are in the process of moving over the files to another service. In light of yesterday's events we have decided to rebuild Souari and get rid of any problems it had. Web Service is NOT affected by this; only the FTP service (the service that allows you to make changes to your website) is affected while this move takes place.
If it is an absolute emergency (such as your website is broken) we will stop the process. We apologize for this major inconvenience; however, we are doing this to ensure the privacy of your data.
Please monitor this website for any updates concerning this situation.
posted by at 10:50 AM on Wednesday, March 29, 2006
Categories: Emergency Maintenance, souari.forest.net
Neptune.wwwnexus.com continues to be down at this time. We have replaced the power supply; however, we think the power supply failed while the system was writing to the hard drives. Because of the 'dirty' shutdown caused by the power supply, some of the services (such as the web service) are not starting up.
We are working hard on resolving this situation and at this time can't give an estimated time of repair.
Please stay tuned for any further updates.
posted by at 06:04 PM on Friday, March 10, 2006
Categories: Emergency Maintenance
At 1:30 PM PST the machine neptune.wwwnexus.com shut down due to a failed power supply. We are currently replacing the power supply and testing the machine's file system for any problems.
We anticipate that Neptune will be back online at approximately 2:45 PM.
posted by at 02:08 PM on Friday, March 10, 2006
Categories: Emergency Maintenance
We will be taking sage.forest.net (FileMaker 7 Server hosting) down in a few minutes in order to diagnose a problem we are experiencing with the machine.
The machine should be back up by 5:45pm PST.
posted by at 04:53 PM on Friday, February 17, 2006
Categories: Emergency Maintenance
On Thursday night, catalpa.forest.net (smtp.forest.net) suffered a hard drive failure at approximately 7pm PST. Currently we are spooling the mail for the accounts hosted on this machine on another mail server and we have pointed smtp.forest.net to another mail server located at digital.forest. In the meantime we are migrating all the affected accounts over to another mail server.
Accounts on other email servers such as treehouse, palm (infoasis) and ninewire are not affected unless their mail client's outbound (SMTP) server is set to smtp.forest.net. If this is the case please change the outbound SMTP to the same as your incoming (POP3 or IMAP) mail server. Be sure to enable SMTP authentication in your mail client when you make this change.
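For customers who send mail from a script rather than a desktop mail client, the same change can be sketched as follows. This is a minimal illustration, not our official configuration: the host name, port 587, and the use of STARTTLS are assumptions, and `build_message` and `send_via_incoming_host` are hypothetical helper names. Substitute the name of your own incoming (POP3 or IMAP) mail server and your email username and password.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_incoming_host(msg, host, username, password, port=587):
    """Send msg through the same host used for incoming mail.

    Port 587 and STARTTLS are assumptions; use whatever your
    mail server actually supports.
    """
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.starttls()                  # assumed: server offers STARTTLS
        smtp.login(username, password)   # SMTP authentication
        smtp.send_message(msg)
```

A call would look like `send_via_incoming_host(msg, "mail.example.com", "user@example.com", "password")`, where `mail.example.com` stands in for your incoming mail server.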
We currently do not have an estimated time of repair, but we anticipate that we will have the affected customers moved over quickly.
If there are any updates concerning catalpa.forest.net we will post them here as soon as they are available.
digital.forest technical support
posted by digital.forest at 01:34 PM on Friday, January 27, 2006
Categories: Emergency Maintenance, Mail, catalpa.forest.net
At approximately 1:45 AM PST this morning we experienced a power overload condition on a single electrical circuit, which services half of two racks in our facility. This tripped a breaker in one of our Power Distribution Units. We reset the breaker and used power monitoring equipment to measure the load on that circuit as servers rebooted. With the data collected, we rerouted some power cables in those two racks to spread the load in a manner which should prevent this from happening again.
Most of the servers affected are digital.forest shared hosting servers; however, one of the two racks contained some colocated servers and one dedicated server. We will be contacting the affected clients during the business day with a follow-up.
Server downtime was limited to about 20 minutes maximum, with most servers being down less than 15 minutes.
We apologize for any inconvenience.
Chuck Goolsbee
VP Technical Operations
posted by Chuck G. at 02:48 AM on Friday, December 30, 2005
Categories: Colocated & Dedicated Servers, Emergency Maintenance, Hosting Servers, Miscellaneous
We are currently troubleshooting a performance issue with Lasso on Banana and appreciate your patience.
---update---
Banana has stabilized.
posted by at 10:50 AM on Saturday, December 10, 2005
Categories: Emergency Maintenance
Treehouse will be shut down at 10pm PDT and we will be migrating all of the users to a new server on a different platform.
Stay tuned for updates by monitoring this space.
Thank you for your continued patience
digital.forest technical support
posted by digital.forest at 04:36 PM on Tuesday, November 1, 2005
Categories: Emergency Maintenance, Mail, treehouse.forest.net
The mail server is currently back online and churning through the queue that has built up due to the downtime. It is still performing poorly at this time.
Connections via your mail client (such as Eudora, Outlook, Entourage, etc.) will most likely not be working at all due to the state of the mail server.
Right now your best bet will be to access the web mail portion by going to http://mail.(yourdomain) and then logging in using your email username and password. This may not work well either, but it will be more reliable than using a mail client.
Tonight we will replace the mail server, treehouse.forest.net, with a different mail server in order to restore complete functionality to our email.
We thank you for your patience,
digital.forest technical support
posted by digital.forest at 01:49 PM on Tuesday, November 1, 2005
Categories: Emergency Maintenance, Mail, treehouse.forest.net
In the next 30 minutes we expect to have an update from our email system administrator on a potential return to service time.
More information will be posted to this website at that time.
While message loss is a concern to everyone, we don't anticipate any significant email losses.
Thank you for your continuing patience,
digital.forest technical support
posted by digital.forest at 12:32 PM on Tuesday, November 1, 2005
Categories: Emergency Maintenance, Mail, treehouse.forest.net
In our continuing effort to get our mail server, treehouse.forest.net, fully back online, we will be taking it offline for emergency maintenance. It is down until further notice.
Please monitor this space for further updates.
Thank you for your continuing patience,
digital.forest technical support
posted by digital.forest at 11:33 AM on Tuesday, November 1, 2005
Categories: Emergency Maintenance, Mail, treehouse.forest.net
Update: 8:45 AM Mail.ninewire.com will be fixed next. Downtime should be around 15 minutes, starting at 9 AM PST.
Update: 8:25 AM Treehouse is back up. Still some issues to work out (TLS/SSL related) stay tuned.
We are carrying out emergency maintenance on two of our five mailservers over the next few minutes. This is to address the problem we noted last night, which was crashing the server. "Treehouse.forest.net" will be the first, with "mail.ninewire.com" being the second. We apologize for any inconvenience this may cause, but these problems require immediate attention in order to provide stable and reliable email service going forward.
posted by Chuck G. at 08:03 AM on Tuesday, November 1, 2005
Categories: Emergency Maintenance, treehouse.forest.net
A network configuration change made during last night's scheduled maintenance has caused a minor issue with clients behind our shared firewall service. A reboot of the firewall will clear this issue and allow proper functioning again. We will reboot the firewall at noon PDT today. Downtime should only be a few seconds while the firewall restarts.
This should have no effect on the rest of our clients or network.
Update: 11:15 AM PDT The firewall reboot has been cancelled as it is no longer required. We have addressed the issue by other means.
For the terminally curious here are the details:
Last night just after midnight, as part of our plan for dramatically increasing our levels of network redundancy, we migrated one of our upstream fiber connections to our second boundary router. We also finished enabling Spanning Tree Protocol on all of our Ethernet switches to recognize redundant trunks we will be deploying in the coming weeks.
In this case it was the gigabit Ethernet connection from XO Communications (AS 2828), that we moved from our original Cisco 6509 router, over to our secondary Cisco 6509 router.
When we did this, all network operations appeared to acknowledge the change via iBGP and OSPF protocols as expected.
Unfortunately our managed firewall device did not. We began to get calls from some clients concerning reachability of certain servers around 6:30 this morning. By 7:30 we had isolated the problem to the change made last night, and actually shut down the gigabit connection to XO to guarantee connectivity to shared firewall clients while we worked out how to address this problem with minimal downtime for the affected clients.
We planned a config change and reboot of the firewall for noon today, but in the meantime we were able to forestall that action by redistributing static routes between the firewall and the two different routers via OSPF.
That action was completed at 11 AM PDT today, and should prevent routing issues like the one we experienced last night from recurring.
Please be aware of the following:
This did not mean that servers were "down." The firewall remained up, and all servers behind it were reachable via normal network channels. The issue was that if OUTBOUND traffic from those servers was destined for the XO connection, then the firewall had the incorrect routing information and was unable to send it. XO carries approximately 20-30% of our outbound traffic.
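For readers who want to see the failure mode in miniature, here is a toy model of it. It is only a sketch under stated assumptions, not our actual firewall or router configuration: the router names, the documentation prefix 203.0.113.0/24 standing in for an XO-bound destination, and the helper functions `lookup` and `deliverable` are all hypothetical, and the model simplifies by treating traffic handed to the wrong router as undeliverable.

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop for destination using longest-prefix match."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, hop in table.items():
        net = ipaddress.ip_network(prefix)
        if dest in net:
            matches.append((net.prefixlen, hop))
    if not matches:
        return None
    # The most specific (longest) matching prefix wins.
    return max(matches)[1]


# Hypothetical: after the move, the XO uplink lives on the second router.
uplink_owner = {"xo": "router-b"}

# Firewall table before the fix: the XO-bound prefix still points at router-a.
stale_table = {"0.0.0.0/0": "router-a", "203.0.113.0/24": "router-a"}

# Firewall table after routes are shared with both routers.
updated_table = {"0.0.0.0/0": "router-a", "203.0.113.0/24": "router-b"}


def deliverable(table, destination, upstream):
    """True if the firewall hands the packet to the router owning the uplink."""
    return lookup(table, destination) == uplink_owner[upstream]
```

In this model `deliverable(stale_table, "203.0.113.10", "xo")` is false while the same call with `updated_table` is true, which mirrors why only outbound traffic bound for XO was affected.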
posted by Chuck G. at 10:24 AM on Thursday, October 27, 2005
Categories: Emergency Maintenance, Managed Firewall Services, Network
The Neptune server has been taken down for emergency maintenance. It will likely be down the remainder of the night and returned to service by 9AM Pacific Time.
We apologize for the inconvenience and hope to avoid further issues through this procedure.
posted by digital.forest at 11:08 PM on Wednesday, October 19, 2005
Categories: Emergency Maintenance
We have learned of a bug, which has been confirmed by FileMaker Inc., that involves FileMakerPro Advanced Server Seven running on a dual-CPU machine. Sage.forest.net is a twin-CPU Xserve and it is experiencing very poor performance under high load due to this bug. Thankfully we have a single CPU Xserve chassis that we can swap into sage. We will be shutting down sage.forest.net for about 5 minutes within the next 30 minutes to address this issue.
Thanks for your patience while we perform this much needed work.
Update: 13:40 PDT.
This work has been completed.
You can read more info about this particular bug here on FileMaker's support FAQ website. Their suggestion for turning off a CPU in firmware seemed a bit over the top for us, especially when we have spare hardware at hand and the Xserve lends itself well to component swapping. Additionally, we added 512 MB of RAM to sage while we had it open.
Again, thanks for your patience while we addressed this issue.
posted by Chuck G. at 01:09 PM on Thursday, October 6, 2005
Categories: Emergency Maintenance
|
|