• DRBD Errors Caused By Physical Corruption


    In a RAID array I had a have periodically lost a drive here and there over the past several months. Iwas always able to readd and resync without losing data. However at some point it looks like some minor corruption happened and this makes DRBD unhappy. Using fsck did not help either. Dec 19 06:01:45 storageboxtest4 kernel: [19005.945890] EXT3-fs error (device drbd0): ext3_get_inode_loc: unable to read inode block - inode=22184379........
  • Why SMART is not smart at all and doesn't properly predict disk errors that cause a kernel panic or crash


    Before getting into the output here is my typical experience with SMART, there is what I call a "bad disk" with pending and uncorrectable sectors that cannot be reallocated. It has caused a kernel panic and system crash repeatedly as we can see from the logs. But SMART says it has "PASSED" its self assessment. SMART is still useful to me but it is more about looking at Current_Pending_Sector. Any time I have had anything but 0 for that attribute it........
  • DRBD Split-brain solution


    Uh oh [17925926.174277] block drbd0: Handshake successful: Agreed network protocol version 96 [17925926.174325] block drbd0: conn( WFConnection -> WFReportParams ) [17925926.174342] block drbd0: Starting asender thread (from drbd0_receiver [1682]) [17925926.174432] block drbd0: data-integrity-alg: [17925926.174581] block drbd0: drbd_sync_handshake: [17925926.174586] block drbd0: self 2AAE66AF9252D6DB:2815BF........
  • /dev/drbd0: State change failed: (-2) Need access to UpToDate data solution


    Everytime I've seen this error "/dev/drbd0: State change failed: (-2) Need access to UpToDate data" it is because DRBD has no disk: cat /proc/drbd version: 8.3.13 (api:88/proto:86-96) GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51 0: cs:Connected ro:Secondary/Secondary ds:Diskless/Inconsistent A r----- ns:0 nr:0 dw:0 dr:0 al........
  • DRBD Slow Performance - 99.99 % [jbd2/drbd0-8] highiowait solution


    Tired of checking iotop and seeing that your drbd partition is using 99.99% of io all the time and finding your drbd device performs slow in general? This is especially an issue in versions of DRBD in the 8.3 tree in particular one documented case is on "8.3.13" but it likely applies to other devices. The symptoms are that resyncing is fine and normal but any reasonable amount of activity is very slow and lagged and creates a high server load and con........
  • 99.99 % [jbd2/drbd0-8] highiowait


    I have not found the source of this but essentially it seems like drbd and ext4 may not play well but I have to confirm still. In either case an older DRBD setup with older hard drives seems to have little to no iowait, but the main difference is the drbd partition is ext3 and not ext4. I will experiment and see if that fixes this, then we will know that DRBD and ext4 have issues.........
  • drbd 8.3 hard drive failure recovery example


    drbd 8.3 hard drive failure recovery drbdadm attach r0 DRBD module version: 8.3.10 userland version: 8.3.8 you should upgrade your drbd tools! 0: Failure: (119) No valid meta-data signature found. ==> Use 'drbdadm create-md res' to initialize meta-data area. ........
  • drbd won't create device if previous partition is on it Command 'drbdmeta 0 v08 /dev/md160 internal create-md' terminated with exit code 40


    This is what fixed it: [root@box13 ~]# dd if=/dev/zero of=/dev/md160 bs=512 count=500 Basically you need to wipe out more than just the 512 byte partition table so 512 bytes * 500 is more than enough to make DRBD happy and think the partition is now empty. The reason this happens is because it gets confused when there is a previous partition with data on the device you are using. root@box13 ~]# d........
  • drbd won't sync 8.3.13 on OpenVZ kernel


    I used the matching 8.3.13 utilities and it didn't work but strangely the newer 8.3.16 which makes DRBD complain works just fine. GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51  0: cs:SyncSource ro:Secondary/Primary ds:UpToDate/Inconsistent A r-----     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:5236960   &am........
  • drbd howto solve splitbrain or WFConnection


    On primary node drbdadm connect all On secondary node drbdadm -- --discard-my-data connect all ........
  • pxe-32 tftp open timeout


    pxe-32 tftp open timeout The solution was to enable tftp in xinetd with "chkconfig tftp on". See the troubleshooting below: chkconfig --list NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off acpid 0:off&n........
  • How to recover from dead DRBD partition/hard drive in two simple commands


    This assumes that you've at least created the correct partition for your DRBD already. Notice that I am "diskless", that's because either your DRBD partition doesn't exist/has been renamed (eg. sdb becomes sda when sdb dies and you reboot) or because that drive is really actually dead/gone. *If you need to permanently change the partition/device for your resource be sure to edit /etc/drbd.conf on both hosts and reload the config. (replace r0 with........
  • Ubuntu 9.04 Crash


    CPU/Kernel/MB/RAID problem? Jan 5 12:45:05 testbox kernel: [653298.890004] BUG: soft lockup - CPU#0 stuck for 61s! [hal-acl-tool:4168] Jan 5 12:45:05 testbox kernel: [653298.890005] Modules linked in: vmnet vmci vmmon binfmt_misc drbd video output input_polldev ocfs2_stackglue ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs k8temp hwmon_vid lp snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi........
  • Ubuntu/Debian DRBD 8.0 Setup Guide


    I've only used it on Centos, soI thought I'd make a quick Debian guide: Install the DRBD Package apt-get install drbd8-utils Reading package lists... Done Building dependency tree Reading state information... Done The following packages were automatically installed and are no longer required: libswfdec-0.8-0 Use 'apt-get autoremove' to remove them. The following........
  • mdadm RAID 1 adventures


    I separated the 2 drives in the RAID 1 array. 1 is the old one /dev/sda and is out of date, while the separated other one /dev/sdc was in another drive and mounted and used with more data (updated). I wonder how mdadm will handle this: usb-storage: device scan complete md: md127 stopped. md: bind md: md127: raid array is not clean -- starting background reconstruction raid1: raid set md127 active with 1 out of 2 m........
  • DRBD WFConnection Problem/Solution


    This has stumped me a few times because I keep forgetting that Centos 5.5 comes with a default iptables configuration that ends up blocking DRBD traffic,I tried all the normal things and couldn't understand why I couldn't make my normal DRBD config work. So if you have WFConnection problems and have tried the normal "mailing list" fixes, check your firewall status first! Both Nodes Say the Following: version: 8.3.8 (api:88/prot........
  • heartbeat is stopped for some reason


    heartbeat is stopped for some reason Anyway hnode2 was active and the services are running fine but I see heartbeat has been stopped somehow. Here is the last log I see of heartbeat: [quote:23c84415f5] Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: MSG stats: 9/1762471 ms age 0 [pid16738/MST_CONTROL] Sep 9 17:15:32 hnode2 heartbeat: [16738]: info: cl_malloc stats: 716/51784021 152624/74519 [pid16738/MST_CONTROL] Sep 9 17:15:32........
  • Clustering Links


    Clustering LinksI thought this might be interesting for people with spare time. [b:6423c19973]Great clustering article from Linux Mag[/b:6423c19973] http://www.linux-mag.com/2003-11/clusters_01.html [b:6423c19973]General Linux cluster information[/b:6423c19973] http://www.gdargaud.net/Hack/ClusterNotes.html#HighA http://www.faqs.org/docs/Linux-HOWTO/Cluster-HOWTO.html#s3 http://www.yolinux.com/TUTORIALS/LinuxClustersAndFileSys........
  • Unixbench Score with Glusterfs/Openvz & Quad Core Xeon - Updated with GlusterFS 2.0.8 & Optimized Client Config


    The results are still not flattering and are nothing close to native performance. Unless GlusterFS has a "DRBD-like" option to delay writes over the network and to only read from the client side, I don't see how performance can ever improve much more. After doing some client optimizations Iadded more to the score: Start Benchmark Run: Sun Nov 29 00:37:44 PST 2009 00:37:44 up 3 min, 1 user, load average: 0.01........
  • Clustered/Distributed Network Filesystems, Which Ones live up to the hype?


    I've tried to find a good sensible solution to cluster with and each technology has it's pros and cons and there is no perfect solution and I've found a lot of "exaggerations" in the applications, benefits and performance of these different filesystems. DRBD I first started off with DRBD and Ihave to say it does live up to the hype, is quite reliable (although it can be annoying to match up the kernel module and user applications since they must match and whe........
  • Latest Articles

  • How To Add Windows 7 8 10 11 to GRUB Boot List Dual Booting
  • How to configure OpenDKIM on Linux with Postfix and setup bind zonefile
  • Debian Ubuntu 10/11/12 Linux how to get tftpd-hpa server setup tutorial
  • efibootmgr: option requires an argument -- 'd' efibootmgr version 15 grub-install.real: error: efibootmgr failed to register the boot entry: Operation not permitted.
  • Apache Error Won't start SSL Cert Issue Solution Unable to configure verify locations for client authentication SSL Library Error: 151441510 error:0906D066:PEM routines:PEM_read_bio:bad end line SSL Library Error: 185090057 error:0B084009:x509 certif
  • Linux Debian Mint Ubuntu Bridge br0 gets random IP
  • redis requirements
  • How to kill a docker swarm
  • docker swarm silly issues
  • isc-dhcp-server dhcpd how to get longer lease
  • nvidia cannot resume from sleep Comm: nvidia-sleep.sh Tainted: Linux Ubuntu Mint Debian
  • zfs and LUKS how to recover in Linux
  • [error] (28)No space left on device: Cannot create SSLMutex Apache Solution Linux CentOS Ubuntu Debian Mint
  • Save money on bandwidth by disabling reflective rpc queries in Linux CentOS RHEL Ubuntu Debian
  • How to access a disk with bad superblock Linux Ubuntu Debian Redhat CentOS ext3 ext4
  • ImageMagick error convert solution - convert-im6.q16: cache resources exhausted
  • PTY allocation request failed on channel 0 solution
  • docker error not supported as upperdir failed to start daemon: error initializing graphdriver: driver not supported
  • Migrated Linux Ubuntu Mint not starting services due to broken /var/run and dbus - Failed to connect to bus: No such file or directory solution
  • qemu-system-x86_64: Initialization of device ide-hd failed: Failed to get