UPS failure analysis

When a UPS system fails, first find the cause: determine whether the fault lies in the load or in the UPS system and, within the UPS, whether it lies in the host or in the battery pack. Although the UPS host has a fault self-check function, considerable analysis and testing are still needed to locate and repair the fault point.

When a UPS supplying an important load fails, do not shut it down blindly. Use its bypass to keep the load powered, then find the cause and determine whether the fault is in the load or in the UPS system. If the UPS system is at fault, further determine whether the fault is in the UPS host or in the battery pack.

In addition, if the self-test circuit itself fails, the displayed fault information may be wrong. For host faults such as component breakdown, blown fuses, or burned devices, the cause must be found and eliminated before restarting; otherwise the fault may spread further within the UPS system.

Troubleshooting method

A UPS is an uninterruptible power supply, and the loads connected to it cannot tolerate a power interruption. Any UPS problem will therefore affect the connected load. UPS operation and maintenance personnel must be familiar with the equipment they use and be able to deal with its common faults promptly. The following sections introduce several basic troubleshooting methods.

Observation method

The observation method is the simplest, most direct and most practical method of equipment troubleshooting. It uses the human senses to make an initial judgment and eliminate equipment faults. Work from simple to complex and from outside to inside; then, with accumulated experience, concentrate on fault-prone circuits or parts, which greatly improves the efficiency of the inspection. The observation method covers four aspects: seeing, listening, smelling and touching.

“Seeing” is the first step in approaching a fault. First, check whether foreign objects bridging parts of the equipment have caused a short circuit. Second, check for open circuits and for seriously damaged or broken wires, paying special attention to whether the insulation of wires or other components is seriously damaged. Third, check whether components are deformed or cracked. Fourth, check whether the connecting screws are loose, whether each connector makes good contact, and whether there are signs of darkening or smoking.

Visual inspection also covers whether contactors and relays pull in, whether the indicators are lit or dark, and whether electrolytic or AC capacitors have burst. A considerable share of faults can be found by looking alone, so that the equipment can be returned to service as soon as possible.

“Listening” means listening for any abnormal sound while the equipment is running. During UPS operation, contactors and relays make sounds, and transformers, reactors and other energized coils produce a characteristic sound. These sounds follow certain patterns and rhythms. If a component fails, its sound changes, and the change can reveal the UPS fault.

“Smelling” means using the sense of smell to check the equipment within a certain range. For example, when the insulation of a conductor or component is scorched or peels off because of excessive current, it gives off a pungent, unpleasant odor.

Within a certain range, this abnormal smell alerts the maintenance personnel to an abnormal operating condition, and the fault can be eliminated by carefully tracing the source of the smell.

“Touching” means using the hands to feel components that are prone to failure, or that have been judged likely to be at fault, to help operators and maintenance personnel find and eliminate faults.

For example, a suspected dry or cold solder joint can be found by lightly touching or shaking it, and temperature can be judged by feel with experience. There are many heat-generating components in a UPS. When the equipment is working, current flows and the temperature of some components rises. As long as the temperature does not exceed the allowable value, the working state of the equipment is not affected.

However, if a component is found to be overheating or its temperature rises suddenly, the equipment may have failed or be about to fail. The fault can then be traced from the heat source and its cause analyzed. “Touching” relies on experience accumulated in everyday work, but parts must not be touched carelessly. Do not touch parts that are hot, at high voltage, or otherwise unsafe to touch, to prevent personal injury and avoid disturbing the normal operation of the equipment.

In short, the observation method can deal with a fault directly and quickly. Seeing, listening, smelling and touching are not isolated from one another; they can be used alone or in combination. For example, if an excessive short-circuit current scorches a wire, the source of the fault can be found either by seeing the change in the wire's color or by smelling the odor it gives off.

Finally, pay particular attention to circuits with a high failure rate.

Graded compression test

In UPS troubleshooting the observation method is effective, but many faults cannot be found by observation alone. They must be located with test methods that narrow down (compress) the scope of the fault step by step until the fault point is found and corrected.

Graded compression means analyzing a fault whose scope is unclear and then testing step by step to narrow it down. The narrowing method depends on the type of equipment fault and on individual maintenance practice. Examples are the midpoint split combined with resistance measurement and the midpoint split combined with voltage measurement.

All of these methods rely on static and dynamic testing. Static testing measures resistance to check whether a circuit is normal, eventually narrowing the fault to a single point. Dynamic testing checks whether the voltages and waveforms of the relevant circuits are normal while the equipment is working. Both speed up maintenance by narrowing the fault range section by section.
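
The split-in-the-middle idea can be sketched as a binary search over test points ordered along the power path. This is only an illustration under stated assumptions: a single fault, a series chain in which every point upstream of the fault reads normal and every point from the fault onward reads abnormal, and a last point that is already known to be abnormal. The names `locate_fault`, `chain` and `probe` and the readings are hypothetical.

```python
from typing import Callable, Sequence


def locate_fault(points: Sequence[str], is_normal: Callable[[str], bool]) -> str:
    """Return the first test point that reads abnormal.

    Assumes one fault in a series chain and that the last point is abnormal.
    """
    lo, hi = 0, len(points) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_normal(points[mid]):
            lo = mid + 1   # fault lies downstream of the midpoint
        else:
            hi = mid       # fault is at the midpoint or upstream of it
    return points[lo]


# Hypothetical chain of test points, ordered from input to output.
chain = ["AC input", "rectifier output", "inverter output",
         "filter output", "load terminal"]
# Hypothetical readings: everything from the inverter output onward is abnormal.
abnormal = {"inverter output", "filter output", "load terminal"}

measured = []  # records which points were actually probed


def probe(point: str) -> bool:
    measured.append(point)
    return point not in abnormal


first_bad = locate_fault(chain, probe)
print(first_bad, "found after", len(measured), "measurements")
```

With five test points the fault is isolated after two measurements instead of up to four sequential ones, which is the speed-up that sectional narrowing aims for.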

Maintenance by sectional narrowing determines whether the relevant circuit is abnormal by testing voltage values, current values and waveforms. Maintenance personnel should therefore know, and be able to measure correctly, the normal waveform and the normal voltage and current ranges at the important working points.

The important working points are generally: the AC power input, the input and output of the rectifier, the input and output of the inverter, the output of the filter, the output of the battery, the high-power switching devices, and the working power supply of each printed circuit board, followed by the relevant voltage and waveform points in each control circuit. Comparing the test results with the normal values narrows the scope of the fault and speeds up troubleshooting.

If the normal voltage, current, or waveform at a test point is not known and conditions permit, measure the corresponding point on an identical unit that is working normally. Comparing the two sets of readings helps to analyze and locate the fault.
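
A comparison against a known-good unit can be sketched as follows. The function name `out_of_tolerance`, the 5% tolerance and all readings are assumptions chosen for illustration; real acceptance limits come from the equipment documentation.

```python
def out_of_tolerance(reference, measured, tolerance=0.05):
    """Return the test points whose reading deviates from the reference
    by more than `tolerance` (a fraction; 0.05 = 5 %, an assumed value).
    A point with no reading at all is also reported."""
    suspects = []
    for point, ref in reference.items():
        value = measured.get(point)
        if value is None or abs(value - ref) > tolerance * abs(ref):
            suspects.append(point)
    return suspects


# Hypothetical readings in volts from a known-good unit and the faulty unit.
good_unit = {"AC input": 220.0, "rectifier output": 27.0, "inverter output": 220.0}
faulty_unit = {"AC input": 221.0, "rectifier output": 26.8, "inverter output": 180.0}

print(out_of_tolerance(good_unit, faulty_unit))
```

Here only the inverter output falls outside the assumed tolerance, so the inverter stage would be the next place to look.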

Other special methods

Some special fault phenomena or fault points are difficult to eliminate by ordinary methods, so special methods can be used to check and narrow them down. For example, poor contact caused by cold solder joints, dry joints, contact wear, oxidation, or rust can easily make the UPS run intermittently. When the equipment misbehaves and the maintenance personnel start checking, the fault may suddenly disappear. For intermittent faults like this, the tap vibration method is generally more effective.

When using the tap vibration method, do not tap too hard. Tapping a suspect part should not shake other areas excessively or create new faults; keep it moderate. Do not tap with metal objects, which can scratch components or cause short circuits.

The UPS equipment may fail due to the increase of the ambient temperature or the increase of the operating temperature inside the equipment itself. After a period of shutdown, the equipment can be restarted as the temperature decreases, and the equipment can work normally again. However, as the working temperature increases, the equipment fails again.

For such temperature-related faults, the best method for narrowing them down is local heating. In the local heating method, suspect parts are deliberately warmed during troubleshooting; the heat source can be a safe heating element, a hair dryer, or similar. If the system fails during local heating, the component with poor thermal performance lies within the heated area. The temperature must be properly controlled: too low and the test has little effect; too high and it may damage components and create new faults.

For some complex faults, especially when the internal working principle of some components is not well understood, the inspection may narrow the fault to one stage or section and leave certain components under suspicion. The substitution method can then be used. It applies to identical printed circuit boards, identical circuit units, or a specific component.

For example, when it is doubtful whether a component is normal, replace it with a component of the same type and specification and see whether the fault disappears. For components whose model and specification are hard to determine, a similar component can be used instead; if the fault disappears, the fault was indeed there. The substitution method is very widely applicable, especially for imported UPS equipment that comes with no reference circuit drawings and no specific parameter data.

There are many maintenance methods for UPS faults, and they cannot all be covered in detail here. Each practitioner should keep refining them in actual troubleshooting and develop a set of methods that suits his or her own way of thinking.

UPS common faults

Under normal circumstances, the common faults of the UPS main board include non-parallel, non-inverting, non-stabilizing, non-charging, inability to use mains power, battery fault, capacitor fault, interference fault, and crash. When overhauling a UPS, check the battery first and then the main board circuit. Once the main board circuit is found to be faulty, check the mains voltage regulator power supply circuit first, and then the inverter circuit.

Non-inverting

Non-inverting means that the UPS works normally with mains power, but the DC voltage of the battery cannot be converted into 220V (or 380V) AC voltage when the mains is interrupted. In this case, the battery voltage should be measured first, because if the battery voltage is too low, the control circuit will interrupt the operation of the inverter circuit after detecting the low battery voltage signal.

Next, check whether the auxiliary power supply is normal and whether the inverter transistors and drive transistors are damaged. Finally, check the output protection circuit. The fault point of the UPS can usually be found and eliminated through these steps.

Unregulated voltage

For an off-line UPS, unregulated voltage falls into two cases: the output is not regulated when running on mains input, and the output is not regulated when running on the inverter. On mains input, the output voltage is regulated by the voltage regulation circuit switching relays that select different taps of the transformer.

On inverter output, the voltage is regulated by sensing the feedback voltage of the inverter and controlling the pulse width of the square-wave signal accordingly. If the UPS fails to regulate voltage, check the corresponding voltage regulation control circuit.

Not charging

A non-charging fault is hard to detect where the mains is seldom interrupted, and it is very harmful: a battery left uncharged for a long time is likely to be scrapped early. Checking for the fault is simple: disconnect the charging circuit from the battery and measure the no-load voltage of the charging circuit.

When the charging circuit is normal, this voltage is 13.5 V for a single 12 V battery and 27 V for two batteries connected in series. If the voltage is abnormal, check the charging circuit and, in particular, its control circuit. When the mains voltage is too low or interrupted, the control circuit stops the charging circuit; if the control circuit itself malfunctions, the charging circuit will not work either.
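
The charger check reduces to simple arithmetic, sketched below. The 13.5 V per 12 V battery figure is the one quoted above; the function names and the 3% acceptance band are assumptions for illustration only.

```python
VOLTS_PER_12V_BATTERY = 13.5  # no-load charger voltage per 12 V battery, as quoted in the text


def expected_charger_voltage(batteries_in_series: int) -> float:
    """Expected no-load charger voltage for a series string of 12 V batteries."""
    return VOLTS_PER_12V_BATTERY * batteries_in_series


def charger_ok(measured_volts: float, batteries_in_series: int,
               tolerance: float = 0.03) -> bool:
    """True if the reading is within an assumed +/-3 % of the expected value."""
    expected = expected_charger_voltage(batteries_in_series)
    return abs(measured_volts - expected) <= tolerance * expected


print(expected_charger_voltage(2))  # 27.0
print(charger_ok(26.9, 2))          # True
print(charger_ok(24.1, 2))          # False
```

A reading far below the expected value, as in the last line, points toward the charging circuit or its control circuit.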

Cannot use mains power

The inverter output is normal, but there is no output on mains input. In this case, check the mains detection circuit first. When this circuit detects that the mains voltage is too low or too high, it signals the control circuit, which sends a control pulse to cut off the mains input path and puts the UPS into the inverter state.

When the detection circuit is normal, finally check the relay conversion circuit. Due to different models, its control relationship and protection circuit types are also very different. There are many reasons for the same fault phenomenon here, but according to experience, the inspection methods are basically the same.

UPS cannot start normally

Under normal circumstances, an online UPS automatically works in bypass mode as soon as the input switch is closed, with the load supplied directly from the mains. After the UPS has been running for a short time, it automatically switches from bypass supply to inverter supply (the normal working mode). If it cannot start normally, there is a problem with the battery or the inverter; check both to find the cause.

Apart from factors inside the machine, when a UPS cannot start normally, first check whether the input voltage is normal and, for a UPS with three-phase input, whether a phase is missing (“phase loss”). A detection circuit inside the UPS monitors the input voltage in real time; if a phase is lost, the three-phase average of the input voltage falls below the lower limit of the normal range, and the detection circuit sends a signal that blocks normal startup.
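
The effect of a lost phase on the three-phase average can be illustrated with a short calculation. The lower limit of 187 V (220 V minus 15%) and the function name are assumptions; the actual threshold depends on the UPS model.

```python
def input_blocks_start(phase_volts, lower_limit=187.0):
    """True if the average of the phase voltages is below the lower limit.

    lower_limit is an assumed value (220 V minus 15 %), not a figure
    taken from any specific UPS.
    """
    average = sum(phase_volts) / len(phase_volts)
    return average < lower_limit


print(input_blocks_start([220.0, 221.0, 219.0]))  # False: average is 220 V
print(input_blocks_start([220.0, 221.0, 0.0]))    # True: average is 147 V
```

With one phase at zero the average drops by about a third, which is well below any plausible lower limit, so startup is blocked.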

If the input voltage is normal and the UPS still does not start normally, then for a single-phase input UPS, check whether the live and neutral wires are reversed; for a three-phase input UPS, check whether the phase sequence of the input voltage is correct.

UPS frequently switches to bypass mode during operation

There are usually three reasons for a UPS to transfer to bypass during normal operation: first, the UPS itself is faulty; second, the UPS is temporarily overloaded; third, it is overheating. For example, when a UPS that is already heavily loaded has further loads switched on, it transfers to bypass because of overload. After the load inrush current has passed, the UPS automatically returns to the normal working mode. Frequent occurrences of this kind are detrimental to stable UPS operation and should be dealt with.

For example, a microcomputer draws a relatively large current at the moment it is switched on, and the current gradually settles to its normal value as time passes. By calculation, the current at the moment of startup is about 2 to 3 times the normal operating current.

Switching on such loads all at once will inevitably overload the UPS and force it to bypass at the moment of loading. To avoid this, add the loads gradually while the UPS is operating normally, so that their inrush currents do not coincide.
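
The benefit of staggered start-up can be estimated with a rough calculation. This sketch assumes each load draws a fixed multiple of its steady current at switch-on and has settled before the next load is started; the load values and the function name are hypothetical.

```python
def peak_current(steady_amps, inrush_factor, staggered):
    """Worst-case total current while the loads are being switched on.

    steady_amps: steady-state current of each load, in switch-on order.
    inrush_factor: start-up current divided by steady current.
    staggered: True if each load settles before the next is switched on.
    """
    if not staggered:
        # Every load is in its inrush phase at the same moment.
        return inrush_factor * sum(steady_amps)
    peak = 0.0
    running = 0.0  # current drawn by loads that have already settled
    for amps in steady_amps:
        peak = max(peak, running + inrush_factor * amps)
        running += amps
    return peak


loads = [4.0, 4.0, 4.0, 4.0]  # hypothetical steady currents in amperes
print(peak_current(loads, 3.0, staggered=False))  # 48.0
print(peak_current(loads, 3.0, staggered=True))   # 24.0
```

With an inrush factor of 3, starting four equal loads one at a time halves the worst-case current in this example, which may be the difference between staying on the inverter and transferring to bypass.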

In addition, comparing the ambient temperature and the temperature displayed on the UPS display screen can help to determine whether the UPS is frequently switched to the bypass power supply mode due to abnormal temperature.

UPS shuts down when the utility power is interrupted

When the utility power is interrupted, the UPS shuts down immediately and the load loses power. The cause is that the battery has failed or its performance has seriously deteriorated, so it does not have enough energy to sustain the load when the mains is interrupted. When checking the battery, its condition cannot be judged from the no-load terminal voltage; the battery should be placed under load and the change in its terminal voltage observed.

When the battery fails or its performance is seriously deteriorated, although its no-load terminal voltage is basically normal, as long as it is discharged, its terminal voltage will drop significantly, and the drop often exceeds the allowable range of the battery. When checking the battery, the load value of the battery is related to the battery capacity. It is recommended to take 70% of the rated capacity of the battery as the discharge current value.

UPS human failure

A so-called “human failure” is one caused by inappropriate human behavior or by a mistaken impression: some are real failures, while others are not failures at all. UPS human failures can be roughly divided into suspected failures, experience failures, knowledge failures, handover failures, operational failures, environmental failures, delay failures, selection failures and maintenance failures.

Suspected failure

As the name suggests, a suspected failure is not actually a failure. It is usually caused by a mistaken impression on the part of the staff on duty. For example, while a dual-unit parallel redundant UPS system was working normally, the attendant opened the front door of the cabinet and noticed that two lights on a control panel were on, whereas he remembered only one being on. He concluded that the UPS was faulty, hurriedly turned off the inverter of that UPS, and urgently called the manufacturer for repair.

When the maintenance engineer arrived and started the UPS, everything was normal. The attendant pointed out that the supposedly faulty UPS still had two indicator lights on. It turned out that the UPS shows two lights when it runs as the master unit and only one otherwise. Such situations mostly arise when the originally trained attendant is absent or has been rotated with an untrained one.

Knowledge failure

Such situations arise mainly because some users lack basic theoretical knowledge. There are many examples, and the user often treats the case as a real fault and demands repair or compensation from the supplier. For example, the user of a three-phase 30 kV·A UPS found that a power module in the equipment had burned out.

The user concluded that the damage was caused by “zero-point drift” of the UPS three-phase voltage and told the UPS supplier that the UPS had a problem and must be checked or replaced immediately. Since the user had raised such a serious issue, the supplier had to take it seriously. The three-phase output voltages of the UPS were measured as 220 V, 219 V and 219 V: the symmetry was very good, and the zero point had not drifted. The power module in the new equipment had failed because of a quality defect in the module itself, and there was no further problem after it was replaced.

Generally speaking, a three-phase voltage difference of less than 2% can be ignored. At present, most UPSs can keep the phase voltage unbalance below 2% even when the three-phase load is 100% unbalanced. A 100% unbalanced three-phase load means that one or two phases of the UPS output are fully loaded (each carrying 1/3 of the rated capacity of the UPS) while the other two or one phase is unloaded. For a 30 kV·A UPS, for example, the full load of one phase is 10 kV·A.

This is not what some people assume, namely that if one phase carries 1 A and another carries 2 A, the unbalance is 50%. Nor does a current of 1 A in one phase and zero in another count as 100% unbalance. That is true in a literal sense, but not by the definition above, which refers to full rated load on the loaded phases.
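
Returning to the measured output voltages of 220 V, 219 V and 219 V, the unbalance can be checked against the 2% figure with a short calculation. The text does not say which formula is meant; this sketch assumes one common definition, the maximum deviation from the mean phase voltage divided by the mean.

```python
def voltage_unbalance_percent(phase_volts):
    """Maximum deviation from the mean phase voltage, as a percentage of the mean.

    This definition is an assumption; the source text does not state one.
    """
    mean = sum(phase_volts) / len(phase_volts)
    return max(abs(v - mean) for v in phase_volts) / mean * 100.0


measured_volts = [220.0, 219.0, 219.0]  # values quoted in the example above
unbalance = voltage_unbalance_percent(measured_volts)
print(f"{unbalance:.2f} %")  # 0.30 %
print(unbalance < 2.0)       # True
```

Under this definition the unbalance is about 0.3%, far below 2%, which supports the conclusion that the output voltage was not the cause of the module failure.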

In another example, a user equipped the UPS with batteries rated for a service life of 3 to 5 years. The temperature of the operating environment often exceeded 30°C in summer, and in more than two years of use the mains never failed, so the batteries were never discharged; their condition can be imagined. When the mains finally did fail, the battery discharge time was less than 1/3 of the rated time. The main cause was that the user had not maintained the batteries regularly as required by the UPS operation manual.

Operation failure

For safe and reliable operation, every UPS product has its own set of operating procedures, written into the manual for users to follow. Following them ensures safe operation of the UPS; ignoring them invites problems. Some users operate the equipment according to their own understanding, which sometimes causes faults. There are also unintentional operating errors, such as accidentally breaking a neighboring component while removing a part during maintenance, which causes a secondary fault.

Other examples: while checking a fault, a test probe accidentally short-circuits two points. When an external battery is connected, the positive and negative poles are reversed, which destroys the inverter. One or more battery terminals are not tightened, or the battery switch is left open, so the battery cannot discharge when the mains fails. An input or output connection is not tightened, which creates the false appearance of an AC failure. The power utility gets the phase sequence wrong when renovating or repairing the line, so the UPS cannot start or fails to switch.

If the operator forgets to start the inverter after powering on the UPS, the UPS shuts down when the mains is cut off. On-duty personnel leave food in the machine room or the nearby duty room, attracting mice that gnaw on the cables or get into the machine and cause failures. A chassis that does not close tightly lets a lizard crawl in and short-circuit part of a circuit board. An unshielded remote signal cable run in parallel with the AC power line picks up coupled interference and causes failures.

Delay failure

Delay failures arise when, through negligence, on-duty personnel do not notice the signs of a fault in time, or notice them but do not deal with them in time. If such faults are discovered and handled promptly, the subsequent failures can be avoided.

For example, in a dual-unit parallel redundant UPS system, the load is shared equally between the two machines. In some UPSs a coincidence of certain conditions occasionally shuts down one of the inverters, and the load is then transferred to the UPS that is still running. This situation is displayed on the panel. If the duty officer notices it in time, pressing the start button of the stopped inverter is enough to restart it.

If the on-duty staff do not notice in time, the system is running as a single unit when the mains is interrupted. First, the overload capacity is reduced; second, the battery backup time is halved (when each UPS has batteries of the same capacity). If the remaining unit is overloaded, all power is cut off.

In another example, a battery operating under unfavorable conditions, especially one that has not been charged and discharged for a long time, should be monitored more closely. A battery found to have significantly reduced capacity should be replaced immediately. Likewise, the fuses and connectors in vehicle-mounted or shipboard UPSs loosen easily under constant vibration, resulting in failure.

Because the fuse carries current for long periods, it softens with heat, then sags and bends in the middle under vibration. If it is not replaced in time, it may break at any moment, causing an avoidable failure.

Maintenance failure

Regular maintenance of the UPS is necessary, but such maintenance should have a set of strict procedures.
Failure to maintain the machine, regularly or as needed, is an important cause of failure. For example, some UPSs go unmaintained for a long time until the user finds that the machine is unstable, shuts down, or cannot be started, and has to request repairs.

After the case was opened, conductive dust was found to fill the whole machine and cover the circuit boards; mixed with moisture, it disrupted the normal operation of the circuits. After the dust was blown out with a hair dryer, the machine returned to normal.

Maintenance itself can also introduce problems. After a UPS is repaired, the maintenance personnel apply AC mains to the input but forget to start the inverter or close the battery switch. At the next mains failure, the inverter does not start, or the UPS shuts down for lack of battery power.

When a considerable part of the battery pack has been damaged for environmental reasons and the user replaces only the obviously faulty batteries, problems follow. At the next mains failure, one of the unreplaced batteries may go open-circuit so that the pack cannot discharge. Even if the pack can discharge, its capacity is greatly reduced by the old batteries; the capacity of the new batteries cannot be fully used, and their life is shortened. If the connection terminals are not tightened after a battery replacement, the pack will not be able to discharge when it is needed.

If tools, components, screws, nuts, washers are forgotten in the machine after repair, or the plug socket is not fastened, it will lead to failure.

Experience failure

Experience is a valuable asset, but each problem should still be analyzed on its own terms; generalizing from experience alone often leads to trouble. For example, a user experienced with brand A UPSs once operated a brand B UPS and attempted a DC start without reading the manual, because every UPS he knew could be started from DC, and he believed he had a workaround even if it would not start.

However, this machine has no DC start function, so of course it did not start. He opened the case and poked the relay with a screwdriver to force the inverter to start. The inverter began to start, but white smoke appeared immediately and the power transistors were destroyed. He did not know that a UPS with a DC start function follows a definite sequence: after the DC start switch is turned on, the control circuit first establishes its working state, and only once that state is normal are the inverter power transistors driven.

Because this machine has no DC start function, the inverter was starting while the control circuit was still establishing its working state. The unstable transition caused the two power transistors on the same arm of the inverter to conduct at the same time and burn out.

Handover failure

This type of failure is mainly caused by on-duty staff being unaware of how the system is meant to work. For example, a securities company purchased a UPS system with two units in a hot-backup connection and a total backup time of 4 hours (2 hours per UPS).

Less than a year later, the company relocated the system on its own, without notifying the UPS manufacturer, and split it into two stand-alone UPSs. The original 4 h of backup time thus became 2 h per unit. Shortly afterwards, an extended mains outage caused the company significant losses.
